The big-endian logic here was simply incorrect. Luckily, it was also
overcomplicated; after calling `Value.writeToPackedMemory`, there is a
method on `std.math.big.int.Mutable` which just does the correct
endianness load for us.
Integers with padding bits on big-endian targets cannot quite be bitcast
with a trivial memcpy, because the padding bits (which are zext or sext)
are the most-significant, so are at the *lowest* addresses. So to
bitcast to something which doesn't have padding bits, we need to offset
past the padding.
The logic I've added here definitely doesn't handle all possibilities
correctly; I think that would actually be quite complicated. However, it
handles a common case, and so prevents the Zig compiler itself from
being miscompiled on big-endian targets (hence fixing a bootstrapping
problem on big-endian).
A new `Legalize.Feature` tag is introduced for each float bit width
(16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and
comparison operations on `f16` are converted to calls to the appropriate
compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`.
This includes casts where the source *or* target type is `f16`, or
integer<=>float conversions to or from `f16`. Occasionally, operations
are legalized to blocks because there is extra code required; for
instance, legalizing `@floatFromInt` where the integer type is larger
than 64 bits requires calling an arbitrary-width integer conversion
function which accepts a pointer to the integer, so we need to use
`alloc` to create such a pointer, and store the integer there (after
possibly zero-extending or sign-extending it).
No backend currently uses these new legalizations (and as such, no
backend currently needs to implement `.legalize_compiler_rt_call`).
However, for testing purposes, I tried modifying the self-hosted x86_64
backend to enable all of the soft-float features (and implement the AIR
instruction). This modified backend was able to pass all of the behavior
tests (except for one `@mod` test where the LLVM backend has a bug
resulting in incorrect compiler-rt behavior!), including the tests
specific to the self-hosted x86_64 backend.
`f16` and `f80` legalizations are likely of particular interest to
backend developers, because most architectures do not have instructions
to operate on these types. However, enabling *all* of these legalization
passes can be useful when developing a new backend to hit the ground
running and pass a good amount of tests more easily.
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.
While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
There is approximately zero chance of the Zig team ever spending any effort on
supporting Cygwin; the MSVC and MinGW-w64 ABIs are superior in every way that
matters, and not least because they lead to binaries that just run natively on
Windows without needing a POSIX emulation environment installed.
The changes to `codegen.c` are blatant hacks, but the problem they work
around isn't a regression: it's an existing miscompilation. This branch
happened to *expose* that miscompilation in more cases by changing how
an incorrect result is *used*.
I had tried unrolling the loops to avoid requiring the
`vector_store_elem` instruction, but it's arguably a problem to generate
O(N) code for an operation on `@Vector(N, T)`. In addition, that
lowering emitted a lot of `.aggregate_init` instructions, which is
itself a quite difficult operation to codegen.
This requires reintroducing runtime vector indexing internally. However,
I've put it in a couple of instructions which are intended only for use
by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`,
but for indexing a vector with a runtime-known index) and
`legalize_vec_store_elem` (like the old `vector_store_elem`
instruction). These are explicitly documented as *not* being emitted by
Sema, so need only be implemented by backends if they actually use an
`Air.Legalize.Feature` which emits them (otherwise they can be marked as
`unreachable`).
I started this diff trying to remove a little dead code from the C
backend, but ended up finding a bunch of dead code sprinkled all over
the place:
* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
The main goal of this change was to avoid emitting the
`vector_store_elem` AIR tag, because this represents an operation which
Zig no longer supports (and hence Sema no longer emits) as of 010d9a6
(because runtime vector indices are now forbidden). Backends should not
need to lower this operation, so I rewrote the legalizations which
emitted it (scalarizations of vector operations) to instead unroll the
loop and hence emit comptime-known vector indices.
In doing this, I actually reworked those legalizations to use a
different strategy; instead of using an `alloc` and storing to
individual vector elements, the vector is constructed by-val, for
instance by performing the scalar operation on all elements and passing
them to an `aggregate_init`. This is vastly simpler to implement in
Legalize, conceptually simpler, and doesn't severely pessimise memory
usage, because a non-optimizing backend will store the full vector on
the stack either way.
Given the above rationale, I also ended up reworking several other
legalizations to use simpler lowerings. The legalizations in question
were bitcast scalarization, `struct_field_val` of `packed struct`s
(where we just bitcast to an integer and perform the appropriate
shift/trunc sequence), and `aggregate_init` of a `packed struct` (also
implemented in terms of integer bitwise operations with bitcasts to and
from the actual types). This hugely simplified some parts of `Legalize`.
So, `Legalize` is now much simpler, and the `vector_store_elem`
instruction is no longer emitted by any part of the compiler so can be
removed in a future commit.
This configuration hasn't had much work put into it yet, so is all but
guaranteed to miscompile or crash. Since users are starting to try out
`-fincremental`, and LLVM is still the default backend in many cases,
it's worth having this warning to avoid bug reports like
https://github.com/ziglang/zig/issues/25873.
To match the new default implementation. In fact, I implemented this by
simply dispatching *to* the default implementation after the debug log
guard; no need to complicate things!
This change removes the ref_start_index from the possible enum values of
Index and OptionalIndex. It is not really a index, but a constant that
tells the offset of static Refs, so lets move it where such constant
belongs i.e. to the Ref.
This seems to work around a very puzzling miscompilation first
present in LLVM 21.x. We already unconditionally add these
clobbers to inline assembly that came from the source, the
valgrind requests should also contain them.