- hash/eql functions moved into a Context object
- *Context functions pass an explicit context
- *Adapted functions pass specialized keys and contexts
- new getPtr() function returns a pointer to value
- remove functions renamed to fetchRemove
- new remove functions return bool
- removeAssertDiscard deleted, use assert(remove(...)) instead
- Keys and values are stored in separate arrays
- Entry is now {*K, *V}, the new KV is {K, V}
- BufSet/BufMap functions renamed to match other set/map types
- fixed iterating-while-modifying bug in src/link/C.zig
We've settled on the nomenclature for the artifacts the compiler
pipeline produces:
1. Tokens
2. AST (Abstract Syntax Tree)
3. ZIR (Zig Intermediate Representation)
4. AIR (Analyzed Intermediate Representation)
5. Machine Code
Renaming `ir` identifiers to `air` will come with the inevitable
air-memory-layout branch that I plan to start after the 0.8.0 release.
When scanDecls happens, we create stub Decl objects that
have not been semantically analyzed. When they get referenced,
they get semantically analyzed.
Before this commit, when they got unreferenced, they were completely
deleted, including deleted from the containing Namespace.
However, if the update did not cause the containing Namespace to get
deleted, for example, if `std.builtin.ExportOptions` is no longer
referenced, but `std.builtin` is still referenced, and then `ExportOptions`
gets referenced again, the Namespace would be incorrectly missing the
Decl, so we get an incorrect "no such member" error.
The solution is to, when dealing with a no longer referenced Decl
objects during an update, clear them to the state they would be in
on a fresh scanDecl, rather than completely deleting them.
Conflicts:
* src/codegen/spirv.zig
* src/link/SpirV.zig
We're going to want to improve the stage2 test harness to print
the source file name when a compile error occurs otherwise std lib
contributors are going to see some confusing CI failures when they cause
stage2 AstGen compile errors.
Previously, ZIR was per-function so we could simply allocate a slice for
all ZIR instructions. However now ZIR is whole-file, so we need a sparse
mapping of ZIR to AIR instructions in order to not waste memory.
Previously, stage2 used a global decl_table for all Decl objects, keyed
by a 16-byte name hash that was hopefully unique. Now, there is a tree
of Namespace objects that own their named Decl objects.
Previously the frontend incorrectly called freeDecl for structs, which
never got allocateDecl called for them. There was simply a missing check
for hasCodeGenBits().
Just like when new parse errors occur during an update, when new AstGen
errors occur during an update, we do not reveal compile errors for Decl
objects which are inside of a newly failed File. Once the File passes
AstGen successfully, it will be compared with the previously succeeded
ZIR and the saved Decl compile errors will be handled properly.
Previously, compile log stored full SrcLoc info, which included absolute
AST node index. This becomes invalid after an incremental compilation.
To make it survive incremental compilation, store an offset from parent
Decl instead.
* Do not report export collision errors until the very end, because it
is possible, during an update, for a new export to be added before an
old one is semantically analyzed to be deleted. In such a case there
should be no compile error.
- Likewise we defer emitting exports until the end when we know for
sure what will happen.
* Sema: Fix not adding a Decl dependency on imported files.
* Sema: Properly add Decl dependencies for all identifier and namespace
lookups.
* After semantic analysis for a Decl, if it is still marked as
`in_progress`, change it to `dependency_failure` because if the Decl
itself failed, it would have already been changed during the call to
add the compile error.
In `Module.semaDecl`, the source location being used was double-relative
to the `Decl`, causing a crash when trying to compute byte offset for
the compile error.
Change the source location to node_offset = 0 since the scope being used
makes the source location relative to the Decl, which already has the
source node index populated.
Before this change, the attempt to save the most recent successful ZIR
code only worked in some cases; this reworks the code to be more robust,
thereby fixing a crash when running the stage2 "enums" CBE test cases.
Conflicts:
* lib/std/os/linux.zig
* lib/std/os/windows/bits.zig
* src/Module.zig
* src/Sema.zig
* test/stage2/test.zig
Mainly I wanted Jakub's new macOS code for respecting stack size, since
we now depend on it for debug builds able to pass one of the test cases
for recursive comptime function calls with `@setEvalBranchQuota`.
The conflicts were all trivial.
* File stores `root_decl: Decl` instead of `namespace: *Namespace`.
This maps more cleanly to the actual ownership, since the `File` does
own the root decl, but it does not directly own the `Namespace`.
* `semaFile` completes the creation of the `Decl` even when semantic
analysis fails. The `analysis` field of the `Decl` will contain the
results of semantic analysis. This prevents cleaning up of memory
still referenced by other Decl objects.
* `semaDecl` sets `Struct.zir_index` of the root struct decl, which
fixes use of undefined value in case the first update contained a ZIR
compile error.
* Compilation: iteration over the deletion_set only tries to delete the
first one, relying on Decl destroy to remove itself from the deletion
set.
* link: `freeDecl` now has to handle the possibility of freeing a Decl
that was never called with `allocateDeclIndexes`.
* `deleteDecl` recursively iterates over a Decl's Namespace sub-Decl
objects and calls `deleteDecl` on them.
- Prevents Decl objects from being destroyed when they are still in
`deletion_set`.
* Sema: fix cleanup of anonymous Decl objects when an error occurs
during semantic analysis.
* tests: update test cases for fully qualified names
This avoids causing false positive compile errors when, for example, a
file had ZIR errors, and then code tried to look up a public decl from
the failed file.
* `Module.File` now retains the most recent successful ZIR in the event
that an update causes a ZIR compile error. This way Zig does not
throw out useful semantic analysis results when an update temporarily
introduces a ZIR compile error.
* Semantic analysis of a File now unconditionally creates a Decl object
for the File. The Decl object is marked as `file_failed` in case of
File-level compile errors. This allows detecting of the File being
outdated, and dependency tracking just like any other Decl.
Decl objects need to know whether they are the owner of the Type/Value
associated with them, in order to decide whether to destroy the
associated Namespace, Fn, or Var when cleaning up.
This commit takes advantage of the new "NameStrategy" that is exposed
in the ZIR in order to name Decls after their parent when asked to. This
makes the next failing test case pass.
which can be either parent, func, or anon. Here's the enum reproduced in
the commit message for convenience:
```zig
pub const NameStrategy = enum(u2) {
/// Use the same name as the parent declaration name.
/// e.g. `const Foo = struct {...};`.
parent,
/// Use the name of the currently executing comptime function call,
/// with the current parameters. e.g. `ArrayList(i32)`.
func,
/// Create an anonymous name for this declaration.
/// Like this: "ParentDeclName_struct_69"
anon,
};
```
With this information in the ZIR, a future commit can improve the
names of structs, unions, enums, and opaques.
In order to accomplish this, the following ZIR instruction forms were
removed and replaced with Extended op codes:
* struct_decl
* struct_decl_packed
* struct_decl_extern
* union_decl
* union_decl_packed
* union_decl_extern
* enum_decl
* enum_decl_nonexhaustive
By being extended opcodes, one more u32 is needed, however we more than
make up for it by repurposing the 16 "small" bits to provide shorter
encodings for when decls_len == 0, fields_len == 0, a source node is not
provided, etc. There tends to be no downside, and in fact sometimes
upsides, to using an extended op code when there is a need for flag
bits, which is the case for all three of these. Likewise, the container
layout can be encoded in these bits rather than into the opcode.
The following 4 ZIR instructions were added, netting a total of 4 freed
up ZIR enum tags for future use:
* opaque_decl_anon
* opaque_decl_func
* error_set_decl_anon
* error_set_decl_func
This is so that opaques and error sets can have the same name hint as
structs, enums, and unions.
`std.builtin.ContainerLayout` gets an explicit integer tag type so that
it can be used inside packed structs.
This commit also makes `Module.Namespace` use a separate set for
anonymous decls, thus allowing anonymous decls to share the same
`Decl.name` as their owner `Decl` objects.
* no more global atomic variable for anonymous decl indexes. Instead
the Namespace decl table is used to figure out anonymous decl names.
- This prepares for better multi-threaded semantic analysis in the
future, with no contention on this global atomic integer.
* implement fully qualified names for namespaces.
Some of the reworkings in this branch put us over the limit, on Linux,
where the kernel disregards the fact that we ask for 16 MiB in the ELF
file. So we ask for more stack space in `main`.
* Sema: implement global variables
- Improved global constants to stop needlessly creating a Var
structure; they can just store the value directly.
- This required making memory management a bit more sophisticated to
detect when a Decl owns the Namespace associated with it, for the
purposes of deinitialization.
* Decl.name and Namespace decl table keys no longer directly
reference ZIR; instead they have heap-duped names, so that deleted
decls, which no longer have any ZIR to reference for their names, can
be removed from the parent Namespace table.
- In the future I would like to explore going a different direction
with this, where the strings would still point to the ZIR however
they would be removed from their owner Namespace objects during the
update detection. The design principle here is that the existence
of incremental compilation as a feature should not incur any cost
for the use case when it is not used. In this example Decl names
could simply point to ZIR string table memory, and it is only
because of incremental compilation that we duplicate their names.
* AstGen: implement threadlocal variables
* CLI: call cleanExit after building a compilation so that in release
modes we don't bother freeing memory or closing file descriptors,
allowing the OS to do it more efficiently.
* Avoid calling `freeDecl` in the linker for unreferenced Decl objects.
* Fix CBE test case expecting the compile error to point to the wrong
column.
After this commit, `pub export fn main() c_int { ... }` will be
correctly detected as the intended entry point, and therefore start code
will not try to export its own conflicting `main` function.
* Implement basic union support
- lots of stuff is still TODO, including runtime field access
- also TODO: resolving the union tag type
- comptime field access is implemented
* DRY up some code by using the `Zir.DeclIterator` for skipping over
decls in structs and unions.
* Start to clean up Sema with regards to calling `.value()` to find out
a const value. Instead, Sema code should call one of these two:
- `resolvePossiblyUndefinedValue` (followed by logic dealing with
undefined values)
- `resolveDefinedValue` (a compile error will be emitted if the value
is undefined)
* An exported function with an unspecified calling convention gets the
C calling convention.
* Implement comptime field access for structs.
* Add another implementation of "type has one possible value" in Sema.
This is a bit unfortunate since the logic is duplicated, but the one
in Type asserts that the types are resolved already, and is
appropriate to call from codegen, while the one in Sema performs
type resolution if necessary, reporting any compile errors that occur
in the process.
during an incremental update change detection,
the function call to get the old contents hash took place
after mangling the old ZIR index, making it access the wrong
array index.
Now that they are lazy, they need to get analyzed in the correct
context, when requested.
This commit also hooks up std.builtin type values being resolved
properly. This is needed, for example, with the `@export` builtin
function, which occurs in start.zig, for `std.builtin.ExportOptions`.
The ZIR code uses the special `Ref.export_options` value, and semantic
analysis has to map this to the corresponding type from `std.builtin`.
AstGen is now completely independent from the rest of the compiler. It
ingests an AST tree and produces ZIR code as the output, without
depending on any of the glue code of the compiler.
In the byteOffset function, compile errors may need to compute the AST
from source bytes in order to resolve source locations. Previously there
were a few lines trying to access the AST before it was loaded. Trivial
fix, just move load the tree at the beginning.
This allows Sema to namespace them separately from function decls with
the same name. Ran into this in std.math.order conflicting with a test
with the same name.
instead of node indexes.
* AstGen: dbg_stmt instructions now have line and column indexes,
relative to the parent declaration. This allows codegen to emit debug
info without having the source bytes, tokens, or AST nodes loaded
in memory.
* ZIR: each decl has the absolute line number. This allows computing
line numbers from offsets without consulting source code bytes.
Memory management: creating a function definition does not prematurely
set the Decl arena. Instead the function is allocated with the general
purpose allocator.
Codegen no longer looks at source code bytes for any reason. They can
remain unloaded from disk.
* AstGen: LocalVal and LocalPtr use string table indexes for their
names. This is more efficient because local variable declarations do
need to include the variable names so that semantic analysis can emit
a compile error if a declaration is shadowed. So we take advantage of
this fact by comparing string table indexes when resolving names.
* The arg ZIR instructions are needed for the above reasoning, as well
as to emit equivalent AIR instructions for debug info.
Now that we have these arg instructions, get rid of the special
`Zir.Inst.Ref` range for parameters. ZIR instructions now refer
to the arg instructions for parameters.
* Move identAsString and strLitAsString from Module.GenZir to AstGen
where they belong.