mirrors/zig - "Borealis" Git by INX: Hosted by INX "Xenon".

mirror of https://codeberg.org/ziglang/zig.git synced 2025-12-06 13:54:21 +00:00

Author	SHA1	Message	Date
mlugg	37a9a4e0f1	compiler: refactor `Zcu.File` and path representation This commit makes some big changes to how we track state for Zig source files. In particular, it changes: * How `File` tracks its path on-disk * How AstGen discovers files * How file-level errors are tracked * How `builtin.zig` files and modules are created The original motivation here was to address incremental compilation bugs with the handling of files, such as #22696. To fix this, a few changes are necessary. Just like declarations may become unreferenced on an incremental update, meaning we suppress analysis errors associated with them, it is also possible for all imports of a file to be removed on an incremental update, in which case file-level errors for that file should be suppressed. As such, after AstGen, the compiler must traverse files (starting from analysis roots) and discover the set of "live files" for this update. Additionally, the compiler's previous handling of retryable file errors was not very good; the source location the error was reported as was based only on the first discovered import of that file. This source location also disappeared on future incremental updates. So, as a part of the file traversal above, we also need to figure out the source locations of imports which errors should be reported against. Another observation I made is that the "file exists in multiple modules" error was not implemented in a particularly good way (I get to say that because I wrote it!). It was subject to races, where the order in which different imports of a file were discovered affects both how errors are printed, and which module the file is arbitrarily assigned, with the latter in turn affecting which other files are considered for import. The thing I realised here is that while the AstGen worker pool is running, we cannot know for sure which module(s) a file is in; we could always discover an import later which changes the answer. So, here's how the AstGen workers have changed. 
We initially ensure that `zcu.import_table` contains the root files for all modules in this Zcu, even if we don't know any imports for them yet. Then, the AstGen workers do not need to be aware of modules. Instead, they simply ignore module imports, and only spin off more workers when they see a by-path import. During AstGen, we can't use module-root-relative paths, since we don't know which modules files are in; but we don't want to unnecessarily use absolute files either, because those are non-portable and can make `error.NameTooLong` more likely. As such, I have introduced a new abstraction, `Compilation.Path`. This type is a way of representing a filesystem path which has a canonical form. The path is represented relative to one of a few special directories: the lib directory, the global cache directory, or the local cache directory. As a fallback, we use absolute (or cwd-relative on WASI) paths. This is kind of similar to `std.Build.Cache.Path` with a pre-defined list of possible `std.Build.Cache.Directory`, but has stricter canonicalization rules based on path resolution to make sure deduplicating files works properly. A `Compilation.Path` can be trivially converted to a `std.Build.Cache.Path` from a `Compilation`, but is smaller, has a canonical form, and has a digest which will be consistent across different compiler processes with the same lib and cache directories (important when we serialize incremental compilation state in the future). `Zcu.File` and `Zcu.EmbedFile` both contain a `Compilation.Path`, which is used to access the file on-disk; module-relative sub paths are used quite rarely (`EmbedFile` doesn't even have one now for simplicity). After the AstGen workers all complete, we know that any file which might be imported is definitely in `import_table` and up-to-date. So, we perform a single-threaded graph traversal; similar to what `resolveReferences` plays for `AnalUnit`s, but for files instead. 
We figure out which files are alive, and which module each file is in. If a file turns out to be in multiple modules, we set a field on `Zcu` to indicate this error. If a file is in a different module to a prior update, we set a flag instructing `updateZirRefs` to invalidate all dependencies on the file. This traversal also discovers "import errors"; these are errors associated with a specific `@import`. With Zig's current design, there is only one possible error here: "import outside of module root". This must be identified during this traversal instead of during AstGen, because it depends on which module the file is in. I tried also representing "module not found" errors in this same way, but it turns out to be much more useful to report those in Sema, because of use cases like optional dependencies where a module import is behind a comptime-known build option. For simplicity, `failed_files` now just maps to `?[]u8`, since the source location is always the whole file. In fact, this allows removing `LazySrcLoc.Offset.entire_file` completely, slightly simplifying some error reporting logic. File-level errors are now directly built in the `std.zig.ErrorBundle.Wip`. If the payload is not `null`, it is the message for a retryable error (i.e. an error loading the source file), and will be reported with a "file imported here" note pointing to the import site discovered during the single-threaded file traversal. The last piece of fallout here is how `Builtin` works. Rather than constructing "builtin" modules when creating `Package.Module`s, they are now constructed on-the-fly by `Zcu`. The map `Zcu.builtin_modules` maps from digests to `Package.Module`s. These digests are abstract hashes of the `Builtin` value; i.e. all of the options which are placed into "builtin.zig". During the file traversal, we populate `builtin_modules` as needed, so that when we see these imports in Sema, we just grab the relevant entry from this map. 
This eliminates a bunch of awkward state tracking during construction of the module graph. It's also now clearer exactly what options the builtin module has, since previously it inherited some options arbitrarily from the first-created module with that "builtin" module! The user-visible effects of this commit are: * retryable file errors are now consistently reported against the whole file, with a note pointing to a live import of that file * some theoretical bugs where imports are wrongly considered distinct (when the import path moves out of the cwd and then back in) are fixed * some consistency issues with how file-level errors are reported are fixed; these errors will now always be printed in the same order regardless of how the AstGen pass assigns file indices * incremental updates do not print retryable file errors differently between updates or depending on file structure/contents * incremental updates support files changing modules * incremental updates support files becoming unreferenced Resolves: #22696	2025-05-18 17:37:02 +01:00
mlugg	b6a1fdd3fa	tests: disable failing tests These were previously incremental tests, so weren't running. They didn't need to be incremental. They worked under the old runner because of how it directly integrated with the compiler so tracked error messages differently.	2025-02-23 00:52:50 +00:00
mlugg	a3b3a33d7a	cases: remove old incremental case system We now run incremental tests with `tools/incr-check.zig` (with the actual cases being in `test/incremental/`).	2025-02-23 00:52:50 +00:00
Will Lillis	cf059ee087	AstGen: improve error for invalid bytes in strings and comments	2025-02-05 11:10:11 +02:00
mlugg	e9bd2d45d4	Sema: rewrite semantic analysis of function calls This rewrite improves some error messages, hugely simplifies the logic, and fixes several bugs. One of these bugs is technically a new rule which Andrew and I agreed on: if a parameter has a comptime-only type but is not declared `comptime`, then the corresponding call argument should not be evaluated at comptime; only resolved. Implementing this required changing how function types work a little, which in turn required allowing a new kind of function coercion for some generic use cases: function coercions are now allowed to implicitly remove `comptime` annotations from parameters with comptime-only types. This is okay because removing the annotation affects only the call site. Resolves: #22262	2025-01-09 06:46:47 +00:00
mlugg	9ff80d7950	cases: update to new compile error wordings	2024-12-31 09:55:04 +00:00
mlugg	605f2a0978	cases: update for new error wording, add coverage for field/decl name conflict	2024-08-29 23:43:52 +01:00
Andrew Kelley	c2b8afcac9	tokenizer: tabs and carriage returns spec conformance	2024-07-31 16:57:42 -07:00
Andrew Kelley	377e8579f9	std.zig.tokenizer: simplify I pointed a fuzzer at the tokenizer and it crashed immediately. Upon inspection, I was dissatisfied with the implementation. This commit removes several mechanisms: * Removes the "invalid byte" compile error note. * Dramatically simplifies tokenizer recovery by making recovery always occur at newlines, and never otherwise. * Removes UTF-8 validation. * Moves some character validation logic to `std.zig.parseCharLiteral`. Removing UTF-8 validation is a regression of #663, however, the existing implementation was already buggy. When adding this functionality back, it must be fuzz-tested while checking the property that it matches an independent Unicode validation implementation on the same file. While we're at it, fuzzing should check the other properties of that proposal, such as no ASCII control characters existing inside the source code. Other changes included in this commit: * Deprecate `std.unicode.utf8Decode` and its WTF-8 counterpart. This function has an awkward API that is too easy to misuse. * Make `utf8Decode2` and friends use arrays as parameters, eliminating a runtime assertion in favor of using the type system. After this commit, the crash found by fuzzing, which was "\x07\xd5\x80\xc3=o\xda\|a\xfc{\x9a\xec\x91\xdf\x0f\\\x1a^\xbe;\x8c\xbf\xee\xea" no longer causes a crash. However, I did not feel the need to add this test case because the simplified logic eradicates most crashes of this nature.	2024-07-31 16:57:42 -07:00
gooncreeper	c50f300387	Tokenizer bug fixes and improvements Fixes many error messages corresponding to invalid bytes displaying the wrong byte. Additionally improves handling of UTF-8 in some places.	2024-07-15 11:31:19 +03:00
Krzysztof Wolicki	45c77931c2	Change deprecated b.host to b.graph.host in tests and Zig's build.zig	2024-06-13 10:49:06 -04:00
Andrew Kelley	142471fcc4	zig build system: change target, compilation, and module APIs Introduce the concept of "target query" and "resolved target". A target query is what the user specifies, with some things left to default. A resolved target has the default things discovered and populated. In the future, std.zig.CrossTarget will be renamed to std.Target.Query. Introduces `std.Build.resolveTargetQuery` to get from one to the other. The concept of `main_mod_path` is gone, no longer supported. You have to put the root source file at the module root now. * remove deprecated API * update build.zig for the breaking API changes in this branch * move std.Build.Step.Compile.BuildId to std.zig.BuildId * add more options to std.Build.ExecutableOptions, std.Build.ObjectOptions, std.Build.SharedLibraryOptions, std.Build.StaticLibraryOptions, and std.Build.TestOptions. * remove `std.Build.constructCMacro`. There is no use for this API. * deprecate `std.Build.Step.Compile.defineCMacro`. Instead, `std.Build.Module.addCMacro` is provided. - remove `std.Build.Step.Compile.defineCMacroRaw`. * deprecate `std.Build.Step.Compile.linkFrameworkNeeded` - use `std.Build.Module.linkFramework` * deprecate `std.Build.Step.Compile.linkFrameworkWeak` - use `std.Build.Module.linkFramework` * move more logic into `std.Build.Module` * allow `target` and `optimize` to be `null` when creating a Module. Along with other fields, those unspecified options will be inherited from parent `Module` when inserted into an import table. * the `target` field of `addExecutable` is now required. Pass `b.host` to get the host target.	2024-01-01 17:51:18 -07:00
Andrew Kelley	5eb5d523b5	give modules friendly names for error reporting	2023-10-08 20:58:04 -07:00
mlugg	09a57583a4	compiler: preserve result type information through address-of operator This commit introduces the new `ref_coerced_ty` result type into AstGen. This represents an expression which we want to treat as an lvalue, and the pointer will be coerced to a given type. This change gives known result types to many expressions, in particular struct and array initializations. This allows certain casts to work which previously required explicitly specifying types via `@as`. It also eliminates our dependence on anonymous struct types for expressions of the form `&.{ ... }` - this paves the way for #16865, and also results in less Sema magic happening for such initializations, also leading to potentially better runtime code. As part of these changes, this commit also implements #17194 by disallowing RLS on explicitly-typed struct and array initializations. Apologies for linking these changes - it seemed rather pointless to try and separate them, since they both make big changes to struct and array initializations in AstGen. The rationale for this change can be found in the proposal - in essence, performing RLS whilst maintaining the semantics of the intermediary type is a very difficult problem to solve. This allowed the problematic `coerce_result_ptr` ZIR instruction to be completely eliminated, which in turn also simplified the logic for inferred allocations in Sema - thanks to this, we almost break even on line count! In doing this, the ZIR instructions surrounding these initializations have been restructured - some have been added and removed, and others renamed for clarity (and their semantics changed slightly). In order to optimize ZIR tag count, the `struct_init_anon_ref` and `array_init_anon_ref` instructions have been removed in favour of using `ref` on a standard anonymous value initialization, since these instructions are now virtually never used. 
Lastly, it's worth noting that this commit introduces a slightly strange source of generic poison types: in the expression `@as(*anyopaque, &x)`, the sub-expression `x` has a generic poison result type, despite no generic code being involved. This turns out to be a logical choice, because we don't know the result type for `x`, and the generic poison type represents precisely this case, providing the semantics we need. Resolves: #16512 Resolves: #17194	2023-09-23 22:01:08 +01:00
mlugg	2209813bae	cases: modify error wording to match new errors The changes to result locations and generic calls have caused mild changes to some compile errors. Some are slightly better, some slightly worse, but none of the changes are major.	2023-08-10 10:00:37 +01:00
Jacob Young	736df27663	Sema: use the correct decl for generic argument source locations Closes #16746	2023-08-09 10:09:01 -04:00
mlugg	3a25f6a22e	Port some stage1 test cases to stage2 There are now very few stage1 cases remaining: * `cases/compile_errors/stage1/obj/` currently don't work correctly on stage2. There are 6 of these, and most of them are probably fairly simple to fix. * `cases/compile_errors/async/` and all remaining `safety/` depend on async; see #6025. Resolves: #14849	2023-03-20 19:55:50 -04:00
Andrew Kelley	29cfd47d65	re-enable test-cases and get them all passing Instead of using `zig test` to build a special version of the compiler that runs all the test-cases, the zig build system is now used as much as possible - all with the basic steps found in the standard library. For incremental compilation tests (the ones that look like foo.0.zig, foo.1.zig, foo.2.zig, etc.), a special version of the compiler is compiled into a utility executable called "check-case" which checks exactly one sequence of incremental updates in an independent subprocess. Previously, all incremental and non-incremental test cases were done in the same test runner process. The compile error checking code is now simpler, but also a bit rudimentary, and so it additionally makes sure that the actual compile errors do not include extra messages, and it makes sure that the actual compile errors output in the same order as expected. It is also based on the "ends-with" property of each line rather than the previous logic, which frankly I didn't want to touch with a ten-meter pole. The compile error test cases have been updated to pass in light of these differences. Previously, 'error' mode with 0 compile errors was used to shoehorn in a different kind of test-case - one that only checks if a piece of code compiles without errors. Now there is a 'compile' mode of test-cases, and 'error' must be only used when there are greater than 0 errors. link test cases are updated to omit the target object format argument when calling checkObject since that is no longer needed. The test/stage2 directory is removed; the 2 files within are moved to be directly in the test/ directory.	2023-03-15 10:48:14 -07:00
mlugg	f94cbab3ac	Add test coverage for some module structures	2023-02-21 02:05:36 +00:00
Tom Read Cutting	346ec15c50	Correctly handle carriage return characters according to the spec (#12661) * Scan from line start when finding tag in tokenizer This resolves a crash that can occur for invalid bytes like carriage returns that are valid characters when not parsed from within literals. There are potentially other edge cases this could resolve as well, as the calling code for this function didn't account for any potential 'pending_invalid_tokens' that could be queued up by the tokenizer from within another state. * Fix carriage return crash in multiline string Follow the guidance of #38: > However CR directly before NL is interpreted as only a newline and not part of the multiline string. zig fmt will delete the CR. Zig fmt already had code for deleting carriage returns, but would still crash - now it no longer does so. Carriage returns encountered before line-feeds are now appropriately removed on program compilation as well. * Only accept carriage returns before line feeds The previous commit was much less strict about this; this one more closely matches the desired spec of only allowing CR characters in a CRLF pair, but not otherwise. * Fix CR being rejected when used as whitespace Missed this comment from ziglang/zig-spec#83: > CR used as whitespace, whether directly preceding NL or stray, is still unambiguously whitespace. It is accepted by the grammar and replaced by the canonical whitespace by zig fmt. * Add tests for carriage return handling	2023-02-19 14:14:03 +02:00
Veikka Tuominen	8eea73fb92	add tests for tuple declarations	2022-11-23 22:16:31 +02:00
Veikka Tuominen	c3b85e4e2f	Sema: further enhance explanation of why expr is evaluated at comptime	2022-10-28 13:31:16 +03:00
r00ster91	51d9db8569	fix(text): hyphenate "comptime" adjectives	2022-10-05 21:19:30 +02:00
r00ster91	654e0b6679	fix(text): hyphenation and other fixes	2022-10-05 21:19:10 +02:00
John Schmidt	b6bda5183e	sema: load the correct AST in failWithInvalidComptimeFieldStore The container we want to get the fields from might not be declared in the same file as the block we are analyzing, so we should get the AST from the decl's file instead.	2022-09-26 08:56:34 +02:00
Veikka Tuominen	b2f02a820f	Sema: check for astgen failures in `semaStructFields` The struct might be a top level struct in which case it might not have Zir. Closes #12548	2022-08-22 11:16:36 +03:00
Veikka Tuominen	e8102d8738	Sema: add note about function call being comptime because of comptime only return type	2022-08-21 12:24:48 +03:00
Veikka Tuominen	2f34d06d01	Sema: `analyzeInlineCallArg` needs a block for the arg and the param	2022-07-25 22:04:08 +03:00
Veikka Tuominen	1463144fc8	Compilation: point caret in error message at the main token	2022-07-15 15:11:43 +03:00
r00ster91	da75eb0d79	Compilation: indent multiline error messages properly Co-authored-by: Veikka Tuominen <git@vexu.eu>	2022-07-12 00:10:39 +03:00
Jakub Konka	62625d9d95	test: migrate stage1 compile error tests to updated test manifest	2022-04-28 18:35:01 +02:00
Andrew Kelley	243afdcdf5	test harness improvements * `-Dskip-compile-errors` is removed; `-Dskip-stage1` is added. * Use `std.testing.allocator` instead of a new instance of GPA. - Fix the memory leaks this revealed. * Show the file name when it is not parsed correctly such as when the manifest is missing. - Better error messages when test files are not parsed correctly. * Ignore unknown files such as swap files. * Move logic from declarative file to the test harness implementation. * Move stage1 tests to stage2 tests where appropriate.	2022-03-31 15:10:31 -07:00
Andrew Kelley	47dfaf47b8	stage2: test compile errors independently Until we land some incremental compilation bug fixes, this prevents CI failures when running the compile errors test cases.	2022-03-30 11:22:27 -07:00
Andrew Kelley	9aa431cba3	test harness: include case names for compile errors in the progress nodes	2022-03-29 12:01:45 -07:00
Cody Tapscott	0568b45779	Move existing compile errors to independent files Some cases had to stay behind, either because they required complex case configuration that we don't support in independent files yet, or because they have associated comments which we don't want to lose track of. To make sure I didn't drop any tests in the process, I logged all obj/test/exe test cases from a run of "zig build test" and compared before/after this change. All of the test cases match, with two exceptions: - "use of comptime-known undefined function value" was deleted, since it was a duplicate - "slice sentinel mismatch" was renamed to "vector index out of bounds", since it was incorrectly named	2022-03-25 12:27:46 -07:00
Cody Tapscott	7f64f7c925	Add rudimentary compile error test file support This brings two quality-of-life improvements for folks working on compile error test cases: - test cases can be added/changed without re-building Zig - wrapping the source in a multi-line string literal is not necessary I decided to keep things as simple as possible for this initial implementation. The test "manifest" is a contiguous comment block at the end of the test file: 1. The first line is the test case name 2. The second line is a blank comment 3. The following lines are expected errors Here's an example: ```zig const U = union(enum(u2)) { A: u8, B: u8, C: u8, D: u8, E: u8, }; export fn entry() void { _ = U{ .E = 1 }; } // union with too small explicit unsigned tag type // // tmp.zig:1:22: error: specified integer tag type cannot represent every field // tmp.zig:1:22: note: type u2 cannot fit values in range 0...4 ``` The mode of the test (obj/exe/test), as well as the target (stage1/stage2) is determined based on the directory containing the test. We'll probably eventually want to support embedding this information in the test files themselves, similar to the arocc test runner, but that enhancement can be tackled later.	2022-03-25 12:25:43 -07:00
Mitchell Hashimoto	a36f4ee290	stage2: able to slice to sentinel index at comptime The runtime behavior allowed this in both stage1 and stage2, but stage1 fails with index out of bounds during comptime. This behavior makes sense to support, and comptime behavior should match runtime behavior. I implement this fix only in stage2.	2022-03-23 17:08:08 -04:00
Mitchell Hashimoto	91fd0f42c8	stage2: out of bounds error for slicing	2022-03-21 22:10:34 -04:00
Daniel Hooper	911c839e97	add error when binary ops don't have matching whitespace on both sides This change also moves the warning about "&&" from the AstGen into the parser so that the "&&" warning can supersede the whitespace warning.	2022-03-20 12:55:04 +02:00
Robin Voetter	5c3325588e	stage1: make type names more unique	2022-03-19 19:40:46 -04:00
Mitchell Hashimoto	394252c9db	stage2: move duplicate error set check to AstGen	2022-03-16 01:41:22 -04:00
Jonathan Marler	d805adddd6	deprecated TypeInfo in favor of Type Co-authored-by: Veikka Tuominen <git@vexu.eu>	2022-03-08 20:38:12 +02:00
Veikka Tuominen	f8154905e7	stage1: rename TypeInfo.FnArg to Fn.Param	2022-02-23 09:44:36 +02:00
Veikka Tuominen	2f0204aca3	parser: fix "previous field here" pointing to wrong field	2022-02-19 10:15:54 +02:00
Veikka Tuominen	6b65590715	parser: add notes to decl_between_fields error	2022-02-17 22:16:26 +02:00
Veikka Tuominen	92f2767814	parser: add error for missing colon before continue expr If a '(' is found where the continue expression was expected and it is on the same line as the previous token issue an error about missing colon before the continue expression.	2022-02-17 20:51:26 +02:00
Veikka Tuominen	c9dde10f86	stage1: improve error message when casting tuples	2022-02-17 17:39:54 +02:00
Veikka Tuominen	9c36cf92f0	parser: make some errors point to end of previous token For some errors if the found token is not on the same line as the previous token, point to the end of the previous token. This usually results in more helpful errors.	2022-02-17 14:23:35 +02:00
Veikka Tuominen	8a432436ae	update compile error tests	2022-02-13 13:48:20 +02:00
Sebsatian Keller	f5471299d8	stage 1: improve error message if error union is cast to payload (#10770) Also: Added special error message for `?T` to `T` casting	2022-02-09 20:35:53 -05:00
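
Commit 37a9a4e0f1 describes a single-threaded graph traversal that runs after the AstGen workers finish: starting from the analysis roots, it works out which files are live and which module each file belongs to, and flags files that end up in more than one module. The following is a minimal Python sketch of that idea only. It is not the compiler's code; `discover_live_files`, `files_in_multiple_modules`, and the four dictionary/list arguments are hypothetical names invented for illustration, and file names stand in for canonical paths.

```python
from collections import deque


def discover_live_files(analysis_roots, module_roots, path_imports, module_imports):
    """Return {file: set of modules} for every file reachable from the roots.

    All four arguments are hypothetical stand-ins for compiler state:
      analysis_roots  list of module names that analysis starts from
      module_roots    {module name: root file of that module}
      path_imports    {file: [files it imports by path]}
      module_imports  {file: [module names it imports]}
    """
    file_modules = {}
    queue = deque((module_roots[m], m) for m in analysis_roots)
    while queue:
        file, module = queue.popleft()
        seen = file_modules.setdefault(file, set())
        if module in seen:
            continue  # already visited this file as part of this module
        seen.add(module)
        # A by-path import stays inside the importing file's module.
        for imported in path_imports.get(file, []):
            queue.append((imported, module))
        # A module import jumps to the root file of the named module.
        for name in module_imports.get(file, []):
            queue.append((module_roots[name], name))
    return file_modules


def files_in_multiple_modules(file_modules):
    """Files that would trigger the "file exists in multiple modules" error."""
    return sorted(f for f, mods in file_modules.items() if len(mods) > 1)


live = discover_live_files(
    analysis_roots=["main"],
    module_roots={"main": "main.zig", "dep": "dep/root.zig"},
    path_imports={
        "main.zig": ["shared.zig"],
        "dep/root.zig": ["shared.zig"],
        "orphan.zig": ["shared.zig"],  # nothing imports orphan.zig
    },
    module_imports={"main.zig": ["dep"]},
)
print(sorted(live))
print(files_in_multiple_modules(live))
```

In this sketch `orphan.zig` never becomes live because nothing reachable imports it, which mirrors the commit's point that file-level errors for unreferenced files should be suppressed.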
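
Commit 29cfd47d65 says the compile error check rejects extra messages, requires the same order as expected, and is based on the "ends-with" property of each line. A rough Python sketch of such a check is below. The function name `errors_match` is hypothetical, and the direction of the comparison (each actual line must end with the corresponding expected line, so that a path prefix in the actual output is ignored) is an assumption, since the commit message does not spell it out.

```python
def errors_match(expected, actual):
    """Compare expected and actual compile error lines.

    Assumed rules, taken loosely from the commit message:
      * no extra or missing messages (the counts must be equal)
      * same order
      * each actual line only has to END WITH its expected line
    """
    if len(expected) != len(actual):
        return False
    return all(a.endswith(e) for e, a in zip(expected, actual))


expected = [
    ":1:22: error: specified integer tag type cannot represent every field",
    ":1:22: note: type u2 cannot fit values in range 0...4",
]
actual = ["/tmp/case/tmp.zig" + line for line in expected]

print(errors_match(expected, actual))
print(errors_match(expected, list(reversed(actual))))
print(errors_match(expected, actual + ["tmp.zig:9:1: error: extra"]))
```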
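
Commit 7f64f7c925 defines the test "manifest" as a contiguous comment block at the end of the test file: the case name, then a blank comment, then the expected errors. The sketch below shows one way such a block could be split off in Python. It is an illustration under assumptions, not the harness's implementation: `parse_manifest` is a hypothetical name, manifest lines are assumed to start with `//` in column 0, and any ordinary `//` comment placed directly above the manifest would be absorbed into it.

```python
def parse_manifest(source):
    """Split a test file into (code, case_name, expected_errors)."""
    lines = source.rstrip("\n").split("\n")
    # Walk backwards over the trailing contiguous run of `//` comment lines.
    start = len(lines)
    while start > 0 and lines[start - 1].startswith("//"):
        start -= 1
    manifest = [line[2:].strip() for line in lines[start:]]
    if len(manifest) < 2 or manifest[1] != "":
        raise ValueError("manifest needs a name line followed by a blank comment")
    name = manifest[0]
    expected = [entry for entry in manifest[2:] if entry]
    code = "\n".join(lines[:start]).rstrip("\n")
    return code, name, expected


SAMPLE = """const U = union(enum(u2)) {
    A: u8,
    B: u8,
    C: u8,
    D: u8,
    E: u8,
};
export fn entry() void {
    _ = U{ .E = 1 };
}

// union with too small explicit unsigned tag type
//
// tmp.zig:1:22: error: specified integer tag type cannot represent every field
// tmp.zig:1:22: note: type u2 cannot fit values in range 0...4
"""

code, name, expected_errors = parse_manifest(SAMPLE)
print(name)
for line in expected_errors:
    print(line)
```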


710 commits