Store `InternPool.Index` as the key instead which means that an AIR
instruction no longer needs to be burned to store the type, and also
that we can use AutoArrayHashMap instead of an ArrayList, which avoids
storing duplicates into the set, potentially saving CPU time.
This allows some code (like struct initializers) to use interned types
while other code (such as comptime mutation) continues to use legacy
types.
With these changes, an `zig build-obj empty.zig` gets to a crash on
missing interned error union types.
One change worth noting in this commit is that `module.global_error_set`
is no longer kept strictly up-to-date. The previous code reserved
integer error values when dealing with error set types, but this is no
longer needed because the integer values are not needed for semantic
analysis unless `@errorToInt` or `@intToError` are used and therefore
may be assigned lazily.
Also I moved `anyframe` from being represented by `SimpleType` to being
represented by the `none` tag of `anyframe_type` because most code wants
to handle these two types together.
I'm seeing a new assertion trip: the call to `enumTagFieldIndex` in the
implementation of `@Type` is attempting to query the field index of an
union's enum tag, but the type of the enum tag value provided is not the
same as the union's tag type. Most likely this is a problem with type
coercion, since values are now typed.
Another problem is that I added some hacks in std.builtin because I
didn't see any convenient way to access them from Sema. That should
definitely be cleaned up before merging this branch.
Unlike unions and structs, enums are actually *encoded* into the
InternPool directly, rather than using the SegmentedList trick. This
results in them being quite compact, and greatly improved the ergonomics
of using enum types throughout the compiler.
It did however require introducing a new concept to the InternPool which
is an "incomplete" item - something that is added to gain a permanent
Index, but which is then mutated in place. This was necessary because
enum tag values and tag types may reference the namespaces created by
the enum itself, which required constructing the namespace, decl, and
calling analyzeDecl on the decl, which required the decl value, which
required the enum type, which required an InternPool index to be
assigned and for it to be meaningful.
The API for updating enums in place turned out to be quite slick and
efficient - the methods directly populate pre-allocated arrays and
return the information necessary to output the same compilation errors
as before.
This commit changes a lot of `*const Module` to `*Module` to make it
work, since accessing the integer tag type of an enum might need to
mutate the InternPool by adding a new integer type into it.
An alternate strategy would be to pre-heat the InternPool with the
integer tag type when creating an enum type, which would make it so that
intTagType could accept a const Module instead of a mutable one,
asserting that the InternPool already had the integer tag type.
Temporarily used for some unfortunate allocations made by backends that
need to construct pointer types that can't be represented by the
InternPool. Once all types are migrated to be stored in the InternPool,
this can be removed.
Instead of doing everything at once which is a hopelessly large task,
this introduces a piecemeal transition that can be done in small
increments at a time.
This is a minimal changeset that keeps the compiler compiling. It only
uses the InternPool for a small set of types.
Behavior tests are not passing.
Air.Inst.Ref and Zir.Inst.Ref are separated into different enums but
compile-time verified to have the same fields in the same order.
The large set of changes is mainly to deal with the fact that most Type
and Value methods now require a Module to be passed in, so that the
InternPool object can be accessed.
The idea here is that there are two ways we can reference a function at runtime:
* Through a direct call, i.e. where the function is comptime-known
* Through a function pointer
This means we can easily perform a form of rudimentary escape analysis
on functions. If we ever see a `decl_ref` or `ref` of a function, we
have a function pointer, which could "leak" into runtime code, so we
emit the function; but for a plain `decl_val`, there's no need to.
This change means that `comptime { _ = f; }` no longer forces a function
to be emitted, which was used for some things (mainly tests). These use
sites have been replaced with `_ = &f;`, which still triggers analysis
of the function body, since you're taking a pointer to the function.
Resolves: #6256Resolves: #15353
This commit removes the `field_call_bind` and `field_call_bind_named` ZIR
instructions, replacing them with a `field_call` instruction which does the bind
and call in one.
`field_call_bind` is an unfortunate instruction. It's tied into one very
specific usage pattern - its result can only be used as a callee. This means
that it creates a value of a "pseudo-type" of sorts, `bound_fn` - this type used
to exist in Zig, but now we just hide it from the user and have AstGen ensure
it's only used in one way. This is quite silly - `Type` and `Value` should, as
much as possible, reflect real Zig types and values.
It makes sense to instead encode the `a.b()` syntax as its own ZIR instruction,
so that's what we do here. This commit introduces a new instruction,
`field_call`. It's like `call`, but rather than a callee ref, it contains a ref
to the object pointer (`&a` in `a.b()`) and the string field name (`b`). This
eliminates `bound_fn` from the language, and slightly decreases the size of
generated ZIR - stats below.
This commit does remove a few usages which used to be allowed:
- `@field(a, "b")()`
- `@call(.auto, a.b, .{})`
- `@call(.auto, @field(a, "b"), .{})`
These forms used to work just like `a.b()`, but are no longer allowed. I believe
this is the correct choice for a few reasons:
- `a.b()` is a purely *syntactic* form; for instance, `(a.b)()` is not valid.
This means it is *not* inconsistent to not allow it in these cases; the
special case here isn't "a field access as a callee", but rather this exact
syntactic form.
- The second argument to `@call` looks much more visually distinct from the
callee in standard call syntax. To me, this makes it seem strange for that
argument to not work like a normal expression in this context.
- A more practical argument: it's confusing! `@field` and `@call` are used in
very different contexts to standard function calls: the former normally hints
at some comptime machinery, and the latter that you want more precise control
over parts of a function call. In these contexts, you don't want implicit
arguments adding extra confusion: you want to be very explicit about what
you're doing.
Lastly, some stats. I mentioned before that this change slightly reduces the
size of ZIR - this is due to two instructions (`field_call_bind` then `call`)
being replaced with one (`field_call`). Here are some numbers:
+--------------+----------+----------+--------+
| File | Before | After | Change |
+--------------+----------+----------+--------+
| Sema.zig | 4.72M | 4.53M | -4% |
| AstGen.zig | 1.52M | 1.48M | -3% |
| hash_map.zig | 283.9K | 276.2K | -3% |
| math.zig | 312.6K | 305.3K | -2% |
+--------------+----------+----------+--------+
* Avoid redundant words ("found")
- All compile errors are found by the compiler
* Avoid unnecessary prepositions ("with")
- There is a grammatically correct alternate word order without the
preposition.
In semaDecl, it was possible for a new ArenaAllocators state to replace an existing one that
hadn't been freed yet. Instead of the ref_count (which was made redundant by adding
the allocator parameter to `release`), I now store a pointer to the previous arena, if one exists.
This allows a recursive deinit to happen when the last arena created is destroyed.
This fixes a bug where resolveStructLayout to was promoting from stale
value_arena state which was then overwrriten when another ArenaAllocator
higher in the call stack saved its state back. This resulted in the memory
for struct_obj.optmized_order overlapping existing allocations.
My initial fix in c7067ef wasn't sufficient, as it only checked if the struct being
resolved had the same owner as the current sema instance. However, it's
possible for resolveStructLayout to be called when the sema instance
has a different owner, but the struct decl's value_arena is currently in
use higher up in the callstack.
This change introduces ValueArena, which holds the arena state as well as tracks
if an arena has already been promoted from it. This allows callers to use the
value_arena storage without needing to be aware of another user of this same storage
higher up in the call stack.
This implements the safety check for error casts. The instruction
generates a jump table with 2 possibilities. The operand is used
as an index into the jump table. For cases where the value does
not exist within the error set, it will generate a jump to the
'false' block. For cases where it does exist, it will generate
a jump to the 'true' block. By calculating the highest and lowest
value we can keep the jump table smaller, as it doesn't need to
contain an index into the entire error set.