Commit graph

459 commits

Author SHA1 Message Date
Andrew Kelley
568f333681 astgen: improve the ensure_unused_result elision 2021-03-22 18:57:46 -07:00
Andrew Kelley
2f391df2a7 stage2: Sema improvements and boolean logic astgen
* add `Module.setBlockBody` and related functions
 * redo astgen for `and` and `or` to use fewer ZIR instructions and
   require less processing for comptime known values
 * Sema: rework `analyzeBody` function. See the new doc comments in this
   commit. Divides ZIR instructions up into 3 categories:
   - always noreturn
   - never noreturn
   - sometimes noreturn
2021-03-22 17:29:56 -07:00
Isaac Freund
9f0b9b8da1
stage2: remove all async related code
The current plan is to avoid using async and related features in the
stage2 compiler so that we can bootstrap before implementing them.

Having this untested and incomplete code in the codebase increases
friction while working on stage2, in particular when preforming
larger refactors such as the current zir memory layout rework.

Therefore remove all async related code, leaving only error messages
in astgen.
2021-03-23 00:23:41 +01:00
Dimenus
240b15381d fix calculation in ensureCapacity 2021-03-22 11:58:44 -07:00
Isaac Freund
8111453cc1
astgen: implement array types 2021-03-22 14:54:13 +01:00
Andrew Kelley
5769c963e0 Sema: implement arithmetic 2021-03-21 19:23:12 -07:00
Isaac Freund
72bcdb639f
astgen: implement bool_and/bool_or 2021-03-22 00:51:25 +01:00
Isaac Freund
310a44d5be zir: add negate/negate_wrap, implement astgen
These were previously implemented as a sub/sub_wrap instruction with a
lhs of 0. Making this separate instructions however allows us to save
some memory as there is no need to store a lhs.
2021-03-21 20:32:39 +01:00
Andrew Kelley
7598a00f34 stage2: fix memory management of ZIR code
* free Module.Fn ZIR code when destroying the owner Decl
 * unreachable_safe and unreachable_unsafe are collapsed into one ZIR
   instruction with a safety flag.
 * astgen: emit an unreachable instruction for unreachable literals
 * don't forget to call deinit on ZIR code
 * astgen: implement some builtin functions
2021-03-20 22:40:08 -07:00
Andrew Kelley
8bad5dfa72 astgen: implement inline assembly 2021-03-20 21:48:35 -07:00
Andrew Kelley
50010447bd astgen: implement function calls 2021-03-20 17:09:06 -07:00
Andrew Kelley
56677f2f2d astgen: support blocks
We are now passing this test:

```zig
export fn _start() noreturn {}
```

```
test.zig:1:30: error: expected noreturn, found void
```

I ran into an issue where we get an integer overflow trying to compute
node index offsets from the containing Decl. The problem is that the
parser adds the Decl node after adding the child nodes. For some things,
it is easy to reserve the node index and then set it later, however, for
this case, it is not a trivial code change, because depending on tokens
after parsing the decl determines whether we want to add a new node or
not.

Possible strategies here:

1. Rework the parser code to make sure that Decl nodes are before
   children nodes in the AST node array.

2. Use signed integers for Decl node offsets.

3. Just flip the order of subtraction and addition. Expect Decl Node
   index to be greater than children Node indexes.

I opted for (3) because it seems like the simplest thing to do. We'll
want to unify the logic for computing the offsets though because if the
logic gets repeated, it will probably get repeated wrong.
2021-03-19 23:15:18 -07:00
Andrew Kelley
937c43ddf1 stage2: first pass at repairing ZIR printing 2021-03-19 19:33:11 -07:00
Andrew Kelley
0357cd8653 Sema: allocate inst_map with arena where appropriate 2021-03-19 15:31:50 -07:00
Andrew Kelley
81a935aef8 stage2: fix some math oopsies and typos 2021-03-19 15:19:47 -07:00
Andrew Kelley
132df14ee1 stage2: fix export source locations not being relative to Decl 2021-03-19 14:59:46 -07:00
jacob gw
c50397c268 llvm backend: use new srcloc
this allows to compile with ninja
2021-03-19 14:46:37 -07:00
jacob gw
e9810d9e79 zir-memory-layout: astgen: fill in identifier 2021-03-19 14:43:08 -07:00
Andrew Kelley
bd2154da3d stage2: the code is compiling again
(with a lot of things commented out)
2021-03-18 22:48:28 -07:00
Andrew Kelley
b2682237db stage2: get Module and Sema compiling again
There are some `@panic("TODO")` in there but I'm trying to get the
branch to the point where collaborators can jump in.

Next is to repair the seam between LazySrcLoc and codegen's expected
absolute file offsets.
2021-03-18 22:19:28 -07:00
Andrew Kelley
66245ac834 stage2: Module and Sema are compiling again
Next up is reworking the seam between the LazySrcLoc emitted by Sema
and the byte offsets currently expected by codegen.

And then the big one: updating astgen.zig to use the new memory layout.
2021-03-17 22:54:56 -07:00
Andrew Kelley
38b3d4b00a stage2: work through some compile errors in Module and Sema 2021-03-17 00:56:08 -07:00
Andrew Kelley
099af0e008 stage2: rename zir_sema.zig to Sema.zig 2021-03-16 00:04:17 -07:00
Andrew Kelley
aef3e534f5 stage2: *WIP*: rework ZIR memory layout; overhaul source locations
The memory layout for ZIR instructions is completely reworked. See
zir.zig for those changes. Some new types:

 * `zir.Code`: a "finished" set of ZIR instructions. Instead of allocating
   each instruction independently, there is now a Tag and 8 bytes of
   data available for all ZIR instructions. Small instructions fit
   within these 8 bytes; larger ones use 4 bytes for an index into
   `extra`. There is also `string_bytes` so that we can have 4 byte
   references to strings. `zir.Inst.Tag` describes how to interpret
   those 8 bytes of data.
   - This is shared by all `Block` scopes.

 * `Module.WipZirCode`: represents an in-progress `zir.Code`. In this
   structure, the arrays are mutable, and get resized as we add/delete
   things. There is extra state to keep track of things. This struct is
   stored on the stack. Once it is finished, it produces an immutable
   `zir.Code`, which will remain on the heap for the duration of a
   function's existence.
   - This is shared by all `GenZir` scopes.

 * `Sema`: represents in-progress semantic analysis of a `zir.Code`.
   This data is stored on the stack and is shared among all `Block`
   scopes. It is now the main "self" argument to everything in the file
   that was previously named `zir_sema.zig`.
   Additionally, I moved some logic that was in `Module` into here.

`Module.Fn` now stores its parameter names inside the `zir.Code`,
instead of inside ZIR instructions. When the TZIR memory layout
reworking time comes, codegen will be able to reference this data
directly instead of duplicating it.

astgen.zig is (so far) almost entirely untouched, but nearly all of it
will need to be reworked to adhere to this new memory layout structure.

I have no benchmarks to report yet, as I am still working through
compile errors and fixing various things that I broke in this branch.

Overhaul of Source Locations:

Previously we used `usize` everywhere to mean byte offset, but sometimes
also mean other stuff. This was error prone and also made us do
unnecessary work, and store unnecessary bytes in memory.

Now there are more types involved into source locations, and more ways
to describe a source location.

 * AllErrors.Message: embrace the assumption that files always have less
   than 2 << 32 bytes.
 * SrcLoc gets more complicated, to model more complicated source
   locations.
 * Introduce LazySrcLoc, which can model interesting source locations
   with very little stored state. Useful for avoiding doing unnecessary
   work when no compile errors occur.

Also, previously, we had `src: usize` on every ZIR instruction. This is
no longer the case. Each instruction now determines whether it even cares
about source location, and if so, how that source location is stored.
This requires more careful work inside `Sema`, but it results in fewer
bytes stored on the heap, without compromising accuracy and power of
compile error messages.

Miscellaneous:

 * std.zig: string literals have more helpful result values for
   reporting errors. There is now a lower level API and a higher level
   API.
   - side note: I noticed that the string literal logic needs some love.
     There is some unnecessarily hacky code there.
 * cut & pasted some TZIR logic that was in zir.zig to ir.zig. This
   probably broke stuff and needs to get fixed.
 * Removed type/Enum.zig, type/Union.zig, and type/Struct.zig. I don't
   think this quite how this code will be organized. Need some more
   careful planning about how to implement structs, unions, enums. They
   need to be independent Decls, just like a top level function.
2021-03-16 00:03:22 -07:00
Veikka Tuominen
8c6e7fb2c7
stage2: implement var args 2021-03-06 15:55:29 +02:00
Veikka Tuominen
17e6e09285
stage2: astgen async 2021-03-06 15:01:25 +02:00
jacob gw
2ebeb0dbf3 stage2: remove error number from error set map
This saves memory since it is already stored in module
as well as allowing for better threading.
Part 2 of what is outlined in #8079.
2021-03-03 11:49:54 -08:00
jacob gw
58b14d01ae stage2: remove value field from error
This saves memory and from what I have heard allows threading
to be easier.
2021-02-28 22:01:13 +02:00
g-w1
153c97ac9e improve stage2 to allow catch at comptime:
* add error_union value tag.
* add analyzeIsErr
* add Value.isError
* add TZIR wrap_errunion_payload and wrap_errunion_err for
  wrapping from T -> E!T and E -> E!T
* add anlyzeInstUnwrapErrCode and analyzeInstUnwrapErr
* add analyzeInstEnsureErrPayloadVoid:
* add wrapErrorUnion
* add comptime error comparison for tests
* tests!
2021-02-25 16:41:16 -08:00
Andrew Kelley
449f4de382 zig fmt src/ 2021-02-24 21:54:23 -07:00
Andrew Kelley
5b597a16c6 stage2: fix not setting up ZIR arg instruction correctly
This is a regression from when I briefly flirted with changing how arg
ZIR instructions work in this branch, and then failed to revert it
correctly.
2021-02-19 20:57:06 -07:00
Andrew Kelley
70761d7c52 stage2: remove incorrect newlines from log statements 2021-02-19 20:27:06 -07:00
Andrew Kelley
8fee41b1d5 stage2: AST: clean up parse errors
* struct instead of tagged union
 * delete dead code
 * simplify parser code
 * remove unnecessary metaprogramming
2021-02-19 18:04:52 -07:00
Andrew Kelley
9010bd8aec stage2: astgen: fix most of the remaining compile errors
more progress on converting astgen to the new AST memory layout.
only a few code paths left to update.
2021-02-18 20:09:29 -07:00
Andrew Kelley
29daf10639 stage2: fix a couple more compilation errors 2021-02-17 22:34:06 -07:00
Andrew Kelley
5a2620fcca stage2: fix some of the compilation errors in this branch 2021-02-17 22:22:10 -07:00
Andrew Kelley
c66481f9bc astgen: finish updating expressions to new mem layout
Now all that is left is compile errors and whatever regressions this
branch introduced.
2021-02-17 20:59:21 -07:00
Andrew Kelley
7630a5c566 stage2: more progress towards Module/astgen building with new mem layout 2021-02-12 23:47:17 -07:00
Andrew Kelley
b4e344bcf8 Merge remote-tracking branch 'origin/master' into ast-memory-layout
Conflicts:
 * lib/std/zig/ast.zig
 * lib/std/zig/parse.zig
 * lib/std/zig/parser_test.zig
 * lib/std/zig/render.zig
 * src/Module.zig
 * src/zir.zig

I resolved some of the conflicts by reverting a small portion of
@tadeokondrak's stage2 logic here regarding `callconv(.Inline)`.
It will need to get reworked as part of this branch.
2021-02-11 23:45:40 -07:00
Andrew Kelley
3d0f4b9030 stage2: start reworking Module/astgen for memory layout changes
This commit does not reach any particular milestone, it is
work-in-progress towards getting things to build.

There's a `@panic("TODO")` in translate-c that should be removed when
working on translate-c stuff.
2021-02-11 23:29:55 -07:00
Tadeo Kondrak
7644e9a752
stage2: switch from inline fn to callconv(.Inline) 2021-02-10 20:22:18 -07:00
Jonathan Marler
1480c42806 require specifier for arrayish types 2021-02-09 22:25:52 -08:00
Andrew Kelley
102d954220
Merge pull request #7827 from Snektron/spirv-setup
Stage 2: SPIR-V setup
2021-02-01 12:49:51 -08:00
Veikka Tuominen
106520329e
stage2 cbe: implement switchbr 2021-02-01 08:48:22 +02:00
Andrew Kelley
0f5eda973e stage2: delete astgen for switch expressions
The astgen for switch expressions did not respect the ZIR rules of only
referencing instructions that are in scope:

  %14 = block_comptime_flat({
    %15 = block_comptime_flat({
      %16 = const(TypedValue{ .ty = comptime_int, .val = 1})
    })
    %17 = block_comptime_flat({
      %18 = const(TypedValue{ .ty = comptime_int, .val = 2})
    })
  })
  %19 = block({
    %20 = ref(%5)
    %21 = deref(%20)
    %22 = switchbr(%20, [%15, %17], {
      %15 => {
        %23 = const(TypedValue{ .ty = comptime_int, .val = 1})
        %24 = store(%10, %23)
        %25 = const(TypedValue{ .ty = void, .val = {}})
        %26 = break("label_19", %25)
      },
      %17 => {
        %27 = const(TypedValue{ .ty = comptime_int, .val = 2})
        %28 = store(%10, %27)
        %29 = const(TypedValue{ .ty = void, .val = {}})
        %30 = break("label_19", %29)
      }
    }, {
      %31 = unreachable_safe()
    }, special_prong=else)
  })

In this snippet you can see that the comptime expr referenced %15 and
%17 which are not in scope. There also was no test coverage for runtime
switch expressions.

Switch expressions will have to be re-introduced to follow these rules
and with some test coverage. There is some usable code being deleted in
this commit; it will be useful to reference when re-implementing switch
later.

A few more improvements to do while we're at it:
 * only use .ref result loc on switch target if any prongs obtain the
   payload with |*syntax|
   - this improvement should be done to if, while, and for as well.
   - this will remove the needless ref/deref instructions above
 * remove switchbr and add switch_block, which is both a block and a
   switch branch.
   - similarly we should remove loop and add loop_block.

This commit introduces a "force_comptime" flag into the GenZIR
scope. The main purpose of this will be to choose the "comptime"
variants of certain key zir instructions, such as function calls and
branches. We will be moving away from using the block_comptime_flat
ZIR instruction, and eventually deleting it.

This commit also contains miscellaneous fixes to this branch that bring
it to the state of passing all the tests.
2021-01-31 21:09:22 -07:00
Andrew Kelley
6c8985fcee astgen: rework labeled blocks 2021-01-31 21:09:22 -07:00
Andrew Kelley
588171c30b sema: after block gets peer type resolved, insert type coercions
on the break instruction operands. This involves a new TZIR instruction,
br_block_flat, which represents a break instruction where the operand is
the result of a flat block. See the doc comments on the instructions for
more details.

How it works: when adding break instructions in semantic analysis, the
underlying allocation is slightly padded so that it is the size of a
br_block_flat instruction, which allows the break instruction to later
be converted without removing instructions inside the parent body. The
extra type coercion instructions go into the body of the br_block_flat,
and backends are responsible for dispatching the instruction correctly
(it should map to the same function calls for related instructions).
2021-01-31 21:09:22 -07:00
Andrew Kelley
b7452fe35f stage2: rework astgen result locations
Motivating test case:

```zig
export fn _start() noreturn {
    var x: u64 = 1;
    var y: u32 = 2;
    var thing: u32 = 1;
    const result = if (thing == 1) x else y;
    exit();
}
```

The main idea here is for astgen to output ideal ZIR depending on
whether or not the sub-expressions of a block consume the result
location. Here, neither `x` nor `y` consume the result location of the
conditional expression block, and so the ZIR should communicate the
result of the condbr using break instructions, not with the result
location pointer.

With this commit, this is accomplished:

```
  %22 = alloc_inferred()
  %23 = block({
    %24 = const(TypedValue{ .ty = type, .val = bool})
    %25 = deref(%18)
    %26 = const(TypedValue{ .ty = comptime_int, .val = 1})
    %27 = cmp_eq(%25, %26)
    %28 = as(%24, %27)
    %29 = condbr(%28, {
      %30 = deref(%4)
      < there is no longer a store instruction here >
      %31 = break("label_23", %30)
    }, {
      %32 = deref(%11)
      < there is no longer a store instruction here >
      %33 = break("label_23", %32)
    })
  })
  %34 = store_to_inferred_ptr(%22, %23) <-- the store is only here
  %35 = resolve_inferred_alloc(%22)
```

However if the result location gets consumed, the break instructions
change to break_void, and the result value is communicated only by the
stores, not by the break instructions.

Implementation:

 * The GenZIR scope that conditional branches uses now has an optional
   result location pointer field and a count of how many times the
   result location ended up being an rvalue (not consumed).
 * When rvalue() is called on a result location for a block, it
   increments this counter. After generating the branches of a block,
   astgen for the conditional branch checks this count and if it is 2
   then the store_to_block_ptr instructions are elided and it calls
   rvalue() using the block result (which will account for peer type
   resolution on the break operands).

astgen has many functions disabled until they can be reworked with these
new semantics. That will be done before merging the branch.

There are some new rules for astgen to follow regarding result locations
and what you are allowed/required to do depending on which one is passed
to expr(). See the updated doc comments of ResultLoc for details.

I also changed naming conventions of stuff in this commit, sorry about
that.
2021-01-31 21:09:22 -07:00
Robin Voetter
b2b87b5900 SPIR-V: Linking and codegen setup 2021-01-19 15:28:17 +01:00
Jakub Konka
d5b0a963d1
Merge pull request #7818 from kubkon/macho-more-cleanup
Macho more cleanup
2021-01-19 08:58:51 +01:00