Simplifies the logic, clarifies the comment, and fixes a minor bug: we
exported the Windows ABI name *instead of* the standard compiler-rt
name, when it is meant to be exported *in addition to* the standard
name (this matches LLVM's behavior and is more useful).
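For illustration, a minimal sketch of the intended behavior, with one
function body exported under two names (the function and both symbol
names are made up, not the actual routine this commit touches):

```zig
fn foo(a: i32, b: i32) callconv(.c) i32 {
    return a +% b;
}

comptime {
    // Standard compiler-rt name.
    @export(&foo, .{ .name = "__foo" });
    // Windows ABI name, exported *in addition*, matching LLVM.
    @export(&foo, .{ .name = "__foo_win" });
}
```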
The build runner was previously forcing child processes to have their
stderr colorization match the build runner by setting `CLICOLOR_FORCE`
or `NO_COLOR`. This is a nice idea in some cases (for instance, a simple
`Run` step which we just expect to exit with code 0 and whose stderr is
not programmatically inspected), but a bad one in others: if stderr is
checked or captured, forcing color on the child can cause those checks
to fail.
Instead, this commit adds a field to `std.Build.Step.Run` which
specifies how the build runner should assign the `CLICOLOR_FORCE` and
`NO_COLOR` environment variables. The
default behavior is to set `CLICOLOR_FORCE` if the build runner's output
is colorized and the step's stderr is not captured, and to set
`NO_COLOR` otherwise. Alternatively, colors can be always enabled,
always disabled, always match the build runner, or the environment
variables can be left untouched so they can be manually controlled
through `env_map`.
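In build-runner terms, the default behavior amounts to something like
this sketch (`env_map` is a `std.process.EnvMap`; the two boolean
parameter names are illustrative):

```zig
const std = @import("std");

fn applyDefaultColorBehavior(
    env_map: *std.process.EnvMap,
    runner_output_colorized: bool,
    stderr_captured: bool,
) !void {
    if (runner_output_colorized and !stderr_captured) {
        try env_map.put("CLICOLOR_FORCE", "1");
    } else {
        try env_map.put("NO_COLOR", "1");
    }
}
```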
Notably, this fixes a failure when running `zig build test-cli` in a
TTY (or with colors explicitly enabled). GitHub CI hadn't caught this
because it does not request color, but Codeberg CI now does, and we were
seeing a failure in the `zig init` test because the actual output had
color escape codes in it due to 6d280dc.
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was
represented as a distinct OS. So why Apple decided to represent Mac
Catalyst only as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.
While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
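As a sketch, an exhaustive Darwin prong now looks something like the
following (tag list to the best of my knowledge; note that Mac Catalyst
does not appear, since it is carried by the ABI component rather than
an OS tag):

```zig
const std = @import("std");

fn isDarwin(tag: std.Target.Os.Tag) bool {
    return switch (tag) {
        .macos, .ios, .tvos, .watchos, .visionos, .driverkit => true,
        else => false,
    };
}
```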
There is approximately zero chance of the Zig team ever spending any effort on
supporting Cygwin; the MSVC and MinGW-w64 ABIs are superior in every way
that matters, not least because they lead to binaries that just run
natively on Windows without needing a POSIX emulation environment
installed.
It's easy to do FP unwinding from a CPU context: you just report the
captured ip/pc value first, and then unwind from the captured fp value.
All this really needed was a couple of new functions on the
`std.debug.cpu_context` implementations so that we don't need to rely on
`std.debug.Dwarf` to access the captured registers.
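A minimal sketch of that scheme, assuming an ABI where the frame record
at `fp` is `[saved fp, return address]` (as on x86-64 and aarch64 with
frame pointers); the context fields and reporting function here are
illustrative, not the actual `std.debug.cpu_context` API:

```zig
const std = @import("std");

fn unwindFromContext(ctx: struct { pc: usize, fp: usize }) void {
    // Report the captured pc first...
    var pc = ctx.pc;
    var fp = ctx.fp;
    while (fp != 0) {
        reportAddress(pc);
        // ...then follow the chain of saved frame pointers.
        const frame: *const [2]usize = @ptrFromInt(fp);
        pc = frame[1]; // return address
        fp = frame[0]; // caller's frame pointer
    }
}

fn reportAddress(addr: usize) void {
    std.debug.print("0x{x}\n", .{addr});
}
```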
Resolves: #25576
The changes to `codegen.c` are blatant hacks, but the problem they work
around isn't a regression: it's an existing miscompilation. This branch
happened to *expose* that miscompilation in more cases by changing how
an incorrect result is *used*.
It turns out we did use these in the C backend. However, it's really
just as easy, if not easier, to replicate the logic directly in C.
Synchronizes stage1/zig.h to make sure the bootstrap doesn't depend on
these functions either. The actual zig1 tarball is unmodified because
regenerating it is unnecessary in this instance.
I had tried unrolling the loops to avoid requiring the
`vector_store_elem` instruction, but it's arguably a problem to generate
O(N) code for an operation on `@Vector(N, T)`. In addition, that
lowering emitted a lot of `.aggregate_init` instructions, and
`aggregate_init` is itself quite a difficult operation to codegen.
This requires reintroducing runtime vector indexing internally. However,
I've put it in a couple of instructions which are intended only for use
by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`,
but for indexing a vector with a runtime-known index) and
`legalize_vec_store_elem` (like the old `vector_store_elem`
instruction). These are explicitly documented as *not* being emitted by
Sema, so need only be implemented by backends if they actually use an
`Air.Legalize.Feature` which emits them (otherwise they can be marked as
`unreachable`).
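For a backend that never enables the relevant `Air.Legalize.Feature`,
handling can look like this sketch (the enum and dispatch are
illustrative stand-ins, not the actual compiler internals):

```zig
const Tag = enum { add, mul, legalize_vec_elem_val, legalize_vec_store_elem };

fn lower(tag: Tag) void {
    switch (tag) {
        .add, .mul => {}, // normal lowering elided
        // Never emitted by Sema, and this backend requests no Legalize
        // feature that produces them:
        .legalize_vec_elem_val, .legalize_vec_store_elem => unreachable,
    }
}
```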
`__addosi4`, `__addodi4`, `__addoti4`, `__subosi4`, `__subodi4`, and
`__suboti4` were all functions that we invented for no apparent reason.
Neither LLVM, nor GCC, nor the Zig compiler uses these functions. It
appears they were created out of a misunderstanding of an old language
proposal; see https://github.com/ziglang/zig/pull/10824.
There is no benefit to these functions existing; if a Zig compiler
backend needs this operation, it is trivial to implement, and *far*
simpler than calling a compiler-rt routine. Therefore, this commit
deletes them. A small amount of that code was used by other parts of
compiler-rt; the logic is trivial, so it has simply been inlined where
needed. I also chose to quickly implement `__addvdi3` (a standard
function) because it is trivial and we already implement its `sub`
counterpart.
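For reference, what `__addosi4` computed is a one-liner in Zig, which is
roughly why the routines had no reason to exist (signature reconstructed
from the naming convention):

```zig
fn addo(a: i32, b: i32, overflow: *c_int) i32 {
    const result = @addWithOverflow(a, b);
    overflow.* = result[1]; // 1 on signed overflow
    return result[0];
}
```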
I started this diff trying to remove a little dead code from the C
backend, but ended up finding a bunch of dead code sprinkled all over
the place:
* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
The main goal of this change was to avoid emitting the
`vector_store_elem` AIR tag, because this represents an operation which
Zig no longer supports (and hence Sema no longer emits) as of 010d9a6
(because runtime vector indices are now forbidden). Backends should not
need to lower this operation, so I rewrote the legalizations which
emitted it (scalarizations of vector operations) to instead unroll the
loop and hence emit comptime-known vector indices.
In doing this, I actually reworked those legalizations to use a
different strategy; instead of using an `alloc` and storing to
individual vector elements, the vector is constructed by-val, for
instance by performing the scalar operation on all elements and passing
them to an `aggregate_init`. This is vastly simpler to implement in
Legalize, conceptually simpler, and doesn't severely pessimise memory
usage, because a non-optimizing backend will store the full vector on
the stack either way.
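At the source level, the by-value strategy corresponds to this sketch of
a scalarized vector add (the AIR equivalent performs the scalar op per
element and feeds the results to a single `aggregate_init`):

```zig
fn addVec4(a: @Vector(4, u32), b: @Vector(4, u32)) @Vector(4, u32) {
    // All indices are comptime-known; no runtime vector indexing needed.
    return .{ a[0] +% b[0], a[1] +% b[1], a[2] +% b[2], a[3] +% b[3] };
}
```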
Given the above rationale, I also ended up reworking several other
legalizations to use simpler lowerings. The legalizations in question
were bitcast scalarization, `struct_field_val` of `packed struct`s
(where we just bitcast to an integer and perform the appropriate
shift/trunc sequence), and `aggregate_init` of a `packed struct` (also
implemented in terms of integer bitwise operations with bitcasts to and
from the actual types). This hugely simplified some parts of `Legalize`.
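For example, the `struct_field_val` lowering corresponds to this
source-level sketch: bitcast the `packed struct` to its backing integer,
then shift and truncate:

```zig
const P = packed struct { a: u4, b: u12 };

fn fieldB(p: P) u12 {
    const bits: u16 = @bitCast(p);
    return @truncate(bits >> 4); // `b` starts at bit offset 4
}
```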
So, `Legalize` is now much simpler, and the `vector_store_elem`
instruction is no longer emitted by any part of the compiler so can be
removed in a future commit.
This fixes package fetching on Windows.
Previously, `Async/GroupClosure` allocations were aligned only for the
closure struct type, which resulted in panics when `context_alignment`
(or `result_alignment`, for that matter) required a greater alignment.
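The fix amounts to sizing the allocation for the strictest alignment
involved, sketched here with illustrative names (not the actual
`std.Io` internals):

```zig
fn requiredAlignment(
    comptime Closure: type,
    context_alignment: usize,
    result_alignment: usize,
) usize {
    // The buffer must satisfy every component it holds, not just the
    // closure struct itself.
    return @max(@alignOf(Closure), context_alignment, result_alignment);
}
```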
`Clock.real` being defined to return timestamps relative to an
implementation-specific epoch means that there's currently no way for
the user to translate returned timestamps to actual calendar dates
without digging into implementation details of any particular `Io`
implementation. Redefining it to return timestamps relative to
1970-01-01T00:00:00Z fixes this problem.
There are other ways to solve this, such as adding a new vtable function
for returning the implementation-specific epoch, but in terms of
complexity this redefinition is by far the simplest solution and only
amounts to a simple 96-bit integer addition's worth of overhead on OSes
like Windows that use non-POSIX/Unix epochs.
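Sketch of the Windows case: FILETIME counts 100 ns intervals since
1601-01-01T00:00:00Z, so the translation is one wide subtract-and-scale
(the function name is illustrative):

```zig
fn filetimeToUnixNanos(filetime: u64) i96 {
    // Seconds between 1601-01-01 and 1970-01-01, in 100 ns intervals.
    const intervals_1601_to_1970 = 11_644_473_600 * 10_000_000;
    return (@as(i96, filetime) - intervals_1601_to_1970) * 100;
}
```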
ML-DSA is a post-quantum signature scheme that was recently
standardized by NIST.
Keys and signatures are pretty large, so it is not a drop-in
replacement for classical signature schemes.
But if you are shipping keys that may still be in use in 10 years,
or whenever large quantum computers able to break ECC arrive
(if that ever happens), and you don't have the ability to replace
those keys, ML-DSA is for you.
Performance is great; verification is faster than with Ed25519 or ECDSA.
I tried manual vectorization, but it wasn't worth it; the compiler
already does a good job of auto-vectorization.
This configuration hasn't had much work put into it yet, so it is all
but guaranteed to miscompile or crash. Since users are starting to try
out
`-fincremental`, and LLVM is still the default backend in many cases,
it's worth having this warning to avoid bug reports like
https://github.com/ziglang/zig/issues/25873.
To match the new default implementation. In fact, I implemented this by
simply dispatching *to* the default implementation after the debug log
guard; no need to complicate things!