Instead of doing everything at once, which is a hopelessly large task,
this introduces a piecemeal transition that can be done in small
increments.
This is a minimal changeset that keeps the compiler compiling. It only
uses the InternPool for a small set of types.
Behavior tests are not passing.
Air.Inst.Ref and Zir.Inst.Ref are separated into different enums but
compile-time verified to have the same fields in the same order.
The large set of changes is mainly to deal with the fact that most Type
and Value methods now require a Module to be passed in, so that the
InternPool object can be accessed.
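An illustrative sketch of the signature change; all types and fields here
are stand-ins, not the compiler's real definitions:

```zig
const InternPool = struct {
    sizes: []const u64,

    fn abiSize(ip: *const InternPool, index: u32) u64 {
        return ip.sizes[index];
    }
};

const Module = struct { intern_pool: InternPool };

const Type = struct {
    ip_index: u32,

    // Before: `fn abiSize(ty: Type) u64`. The Module parameter is now
    // threaded through every call site so the InternPool can be consulted.
    fn abiSize(ty: Type, mod: *const Module) u64 {
        return mod.intern_pool.abiSize(ty.ip_index);
    }
};
```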
Previously we did not write any missing padding bytes after the smallest
field (either tag or payload, depending on alignment). This resulted in
writing too few bytes and not matching the full ABI size of the union.
When storing the calculated element address, ensure it is stored in a
local with the correct type. Previously we would incorrectly use the
element's type, which could be a float, for example, and therefore
generate invalid WebAssembly code.
This change also introduces a more robust `store` function.
When the payload is zero-sized, we must still ensure a valid pointer
is returned for the ptr variant of the instruction, because it is
valid to have a pointer to a zero-sized value.
In such a case, we simply return the operand.
For packed unions whose ABI size is less than or equal to 8 bytes,
we store the value directly and don't pass it by reference. This means
that when retrieving a field, we perform shifts and bitcasts to ensure
the correct type is returned. For larger packed unions, we either allocate
a new stack value based on the field type when that type is also passed
by reference, or load it directly into a local if it's not.
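In user-level terms, retrieving a field from such a directly-stored
packed union looks roughly like this (types chosen purely for
illustration):

```zig
const U = packed union {
    float: f32,
    int: u32,
};

// The whole union lives in a single 64-bit local; extracting the f32
// field is a truncation to the field's bit width followed by a bitcast.
fn readFloatField(backing: u64) f32 {
    return @bitCast(@as(u32, @truncate(backing)));
}
```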
We currently have `isRef` return true for any kind of union, including
packed unions. This means we can simply load it from the data section
as the exact type we want. In the future we can optimize this so it works
similarly to packed structs below 64 bits, which are not stored in
the data section and are not passed by ref.
Previously we would only store the payload, but not the tag that was
set. This caused miscompilations where the wrong tag value would be
returned.
This also adds a tiny optimization for payloads that are not `byRef`:
we store them directly at the given offset, rather than first
calculating a pointer to that offset.
Previously it was incorrectly assumed that all copies generated by
the `memcpy` AIR instruction had an element size of 1 byte. This
resulted in miscompilations for pointers to arrays where the element
size of the array was larger than 1 byte. We now correctly calculate
this size.
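The corrected size computation is essentially the following (names are
illustrative):

```zig
// Number of bytes to copy: element count times the element type's ABI
// size, rather than assuming each element is a single byte.
fn memcpyByteLen(elem_count: u64, elem_abi_size: u64) u64 {
    return elem_count * elem_abi_size;
}
```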
* move `ptrBitWidth` from Arch to Target since it needs to know about the ABI
* double isn't always 8 bytes
* AVR uses 1-byte alignment for everything in GCC
When initializing a packed struct, we must ensure the result local
is zeroed. Previously we did this by allocating a new local, which
is always zero by default. However, if such an initialization was
done inside a loop, the same local could be re-used while still
holding a previous value. Because the new value is `or`'d into the
local, this resulted in a miscompilation. By manually setting the
result to 0, we guarantee the correct behavior.
Previously we incorrectly assumed every memset had an element ABI size
of 1 byte, which set the memory region incorrectly.
We now have a more efficient loop and support any element type by
re-using the `store` function for each element and advancing the
pointer by one element.
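A user-level model of the new lowering, assuming nothing about the real
helper names:

```zig
fn memsetElems(comptime T: type, dest: []T, value: T) void {
    var ptr = dest.ptr;
    const end = dest.ptr + dest.len;
    // One `store` per element; the pointer advances by one element
    // (not one byte), so any element ABI size is handled correctly.
    while (ptr != end) : (ptr += 1) {
        ptr[0] = value;
    }
}
```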
Previously we would use the address of the slice itself, which would
result in miscompilations: we accidentally set the memory region of
the slice itself, rather than the region its `ptr` field points to.
Previously, when lowering an `elem_ptr` value, we would multiply the
index by the ABI size of the parent type rather than that of the
element type. This resulted in an invalid pointer far beyond the
correct one. We now also pass the current offset to each recursive
call to ensure we do not miss inner offsets.
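The fixed address computation amounts to the following sketch (names
are illustrative):

```zig
// The index scales by the *element* type's ABI size, and the caller's
// running offset is carried into each recursive call.
fn elemPtrOffset(current_offset: u64, index: u64, elem_abi_size: u64) u64 {
    return current_offset + index * elem_abi_size;
}
```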
When we had a `ret_load` instruction with a zero-sized type that was
not an error, we would not emit any instruction. This resulted in a
missing `return` instruction and the global `stack_pointer` not being
correctly reset.
This commit also enables the regular test runner for the WebAssembly
backend.
A copy was being made of a WValue variable, which meant the call
to `free` would append the local held by that WValue to the free
list twice. This led to the same local being re-used even though it
wasn't free, and it would then be overwritten by a new value.
Rather than adding all values that were generated in the child branch,
we simply discard them, as outer branches cannot refer to values
produced by an inner branch.
This new tag is used for freed locals, which are not allowed to have any
remaining references pointing to them. The tag allows us to easily
identify liveness bugs. Previously we would set the entire region to
`undefined`, which would incorrectly set the tag to `function_index`,
making codegen think it was a valid `WValue` while it wasn't.
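A sketch of the idea (the tag names are assumed, not necessarily the
backend's real ones):

```zig
const WValue = union(enum) {
    local: u32,
    function_index: u32,
    // Freed local: any remaining reference to it is a liveness bug,
    // which is now easy to detect, unlike an `undefined` blob that may
    // happen to look like a valid tag.
    dead: void,
};
```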
Make sure to increase the reference count for `intcast` when the
operand doesn't require any casting to the respective WebAssembly type.
Function arguments have a reserved slot and therefore cannot be
re-used arbitrarily.
This fix ensures that when we shift left or right, both operands
have the same WebAssembly type. For example, it is not possible to
shift a 64-bit integer by a 32-bit integer; doing so fails
WebAssembly's validator. By first coercing the values to the same
type, we satisfy the validator.
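For example, the following Zig shift has a 64-bit operand but a
narrower shift amount; the backend must widen the amount to 64 bits
(e.g. via `i64.extend_i32_u`) before emitting `i64.shl`:

```zig
fn shiftLeft(x: u64, amt: u6) u64 {
    // Both `i64.shl` operands must be i64 to pass the validator, so
    // the narrower shift amount is extended first.
    return x << amt;
}
```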
Uses the `atomic.fence` instruction for multi-thread-enabled builds
where the `atomics` feature is enabled for the wasm32 target.
In all other cases, this lowers to a nop.
Implements the lowering of the `@atomicRmw` builtin. Uses the atomic
opcodes when the CPU feature `atomics` is enabled, and otherwise lowers
it to regular instructions. For operations that do not map to a direct
atomic opcode, we use a loop in combination with a cmpxchg to ensure
the values are swapped atomically.
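A user-level model of that cmpxchg loop for an operation without a
direct atomic opcode (here `Max`; orderings are illustrative):

```zig
fn atomicMax(ptr: *u32, operand: u32) u32 {
    var old = @atomicLoad(u32, ptr, .seq_cst);
    while (true) {
        const new = @max(old, operand);
        // Retry until no other thread raced us between load and swap;
        // on success, return the previous value.
        old = @cmpxchgWeak(u32, ptr, old, new, .seq_cst, .seq_cst) orelse return old;
    }
}
```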
When the user passes the cpu feature `atomics` to the target triple,
the backend will lower the AIR instruction using opcodes from the
atomics feature instead of manually lowering it.
store:
The value to store may be undefined, in which case the destination
memory region has undefined bytes after this instruction is
evaluated. In such a case, ignoring this instruction is a legal
lowering.
store_safe:
Same as `store`, except if the value to store is undefined, the
memory region should be filled with 0xaa bytes, and any other
safety metadata such as Valgrind integrations should be notified of
this memory region being undefined.
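At the language level, the difference is observable like so (a sketch;
the 0xaa pattern applies to safety-checked builds):

```zig
test "store_safe fills undefined stores with 0xaa" {
    var x: u32 = undefined; // lowered as `store_safe` in safe builds:
    _ = &x;                 // every byte of `x` becomes 0xaa. A plain
                            // `store` may legally elide the write.
}
```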
Rather than using a function call to verify whether an error fits within
the global error set's length, we now store the error set's size in
the .rodata segment of linear memory and load that value onto the
stack to compare against the integer value.
This implements the safety check for error casts. The instruction
generates a jump table with two possible targets, and the operand is
used as an index into that table. For values that do not exist within
the error set, it generates a jump to the 'false' block; for values
that do exist, a jump to the 'true' block. By calculating the highest
and lowest values, we keep the jump table small: it only needs to span
that range rather than the entire error set.
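A model of the generated check (names assumed): the table spans only
`[lowest, highest]`, and the operand is rebased before indexing, so
`in_set.len == highest - lowest + 1`.

```zig
fn errorFitsInSet(operand: u16, lowest: u16, highest: u16, in_set: []const bool) bool {
    // Out-of-range values jump straight to the 'false' block.
    if (operand < lowest or operand > highest) return false;
    // In-range values index the compressed jump table.
    return in_set[operand - lowest];
}
```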
Creates an undefined global symbol when this instruction is
encountered. The linker then resolves it as a lazy symbol, ensuring
the function is only generated when the symbol was actually created.
The function body itself is generated during `flush`, since only then
are all errors known. This logic allows us to re-use the same machinery
as other linker-synthetic functions.