Commit graph

4889 commits

Author SHA1 Message Date
Andrew Kelley
a024aff932 make f80 less hacky; lower as u80 on non-x86
Get rid of `std.math.F80Repr`. Instead of trying to match the memory
layout of f80, we treat it as a value, same as the other floating point
types. The functions `make_f80` and `break_f80` are introduced to
compose an f80 value out of its parts, and the inverse operation.

stage2 LLVM backend: fix pointer to zero length array tripping LLVM
assertion. It now checks for when the element type is a zero-bit type
and lowers such thing the same way that pointers to other zero-bit types
are lowered.

Both stage1 and stage2 LLVM backends are adjusted so that f80 is lowered
as x86_fp80 on x86_64 and i386 architectures, and identical to a u80 on
others. LLVM constants are lowered in a less hacky way now that #10860
is fixed, by using the expression `(exp << 64) | fraction` using llvm
constants.

Sema is improved to handle c_longdouble by recursively handling it
correctly for whatever the float bit width is. In both stage1 and
stage2.
2022-02-12 11:18:23 +01:00
Andrew Kelley
166db1a3ed stage1: fix f80 size and alignment on x86 and arm
* F80Repr extern struct needs no explicit padding; let's match the
   target padding.
 * stage2: fix lowering of f80 constants.
 * stage1: decide ABI size and alignment of f80 based on alignment of
   u64. x86 has alignof u64 equal to 4 but arm has it as 8.
 * stage2: fix Value.floatReadFromMemory to use F80Repr
2022-02-12 11:18:23 +01:00
Andrew Kelley
65c812842d freestanding libc: fix missing functions
In the previous commit I got mixed up and cut-pasted instead of
copy-pasting. In this commit I made c_stage1.zig additionally included
for stage1 and everything else included for both. So moving forward we
move stuff over from c_stage1.zig to c.zig instead of copying.
2022-02-09 19:13:53 -07:00
Andrew Kelley
f2f1c63daf stage2: add log and logf to freestanding libc 2022-02-09 18:52:32 -07:00
Jakub Konka
e588e3873c stage2: export trunc, truncf and truncl 2022-02-09 10:28:48 +01:00
billzez
470c8ca48c
update RwLock to use static initialization (#10838) 2022-02-08 23:35:48 -05:00
Andrew Kelley
5a00e24963 std.Progress: make the API infallible
by handling `error.TimerUnsupported`. In this case, only explicit calls
to refresh() will cause the progress line to be printed.
2022-02-08 17:26:55 -07:00
Luuk de Gram
f50203c836 wasm: update test runner
This updates the test runner for stage2 to emit to stdout with the passed, skipped and failed tests
similar to the LLVM backend.

Another change to this is the start function, as it's now more in line with stage1's.
The stage2 test infrastructure for wasm/wasi has been updated to reflect this as well.
2022-02-08 10:03:29 +01:00
Jan Philipp Hafer
fc59a04061 compiler_rt: add subo
- approach by Hacker's Delight with wrapping subtraction
- performance expected to be similar to addo
- tests with all relevant combinations of min,max with -1,0,+1 and all
  combinations of sequences +-1,2,4..,max
2022-02-08 02:14:29 -05:00
Andrew Kelley
dd49ed1c64
Merge pull request #10813 from marler8997/windowsChildHang
child_process: collectOutputWindows handle broken_pipe from ReadFile
2022-02-07 17:33:26 -05:00
John Schmidt
05cf69209e debug: implement segfault handler for macOS aarch64
Tested on a M1 MacBook Pro, macOS Monterey 12.2.
2022-02-07 16:34:05 -05:00
Arnavion
2a415a033c std.bit_set: add setRangeValue(Range, bool)
For large ranges, this is faster than having the caller call setValue() for
each index in the range. Masks wholly covered by the range can be set to
the new mask value in one go, and the two masks at either end that are
partially covered can each set the covered range of bits in one go.
2022-02-07 14:58:07 -05:00
Andrew Kelley
8a94971980 std: fix i386-openbsd failing to build from source
closes #9705
2022-02-07 12:20:25 -07:00
matu3ba
3db130ff3d
compiler_rt: add addo (#10824)
- approach by Hacker's Delight with wrapping addition
- ca. 1.10x perf over the standard approach on my laptop
- tests with all combinations of min,max with -1,0,+1 and combinations of
  sequences +-1,2,4..,max
2022-02-07 13:27:21 -05:00
Evan Haas
e382e7be2b std: Allow mem.zeroes to work at comptime with extern union
Fixes #10797
2022-02-07 11:34:20 +02:00
boofexxx
435beb4e1d std: fix doc comment typo in os.zig 2022-02-07 09:55:47 +01:00
Jonathan Marler
2cc33367eb fix bug when ReadFile returns synchronously in collectOutputWindows 2022-02-07 01:49:15 -07:00
Cody Tapscott
c1cf158729 Replace argvCmd with std.mem.join 2022-02-06 22:21:46 -07:00
Cody Tapscott
5065830aa0 Avoid depending on child process execution when not supported by host OS
In accordance with the requesting issue (#10750):
- `zig test` skips any tests that it cannot spawn, returning success
- `zig run` and `zig build` exit with failure, reporting the command the cannot be run
- `zig clang`, `zig ar`, etc. already punt directly to the appropriate clang/lld main(), even before this change
- Native `libc` Detection is not supported

Additionally, `exec()` and related Builder functions error at run-time, reporting the command that cannot be run
2022-02-06 22:21:46 -07:00
Andrew Kelley
069dd01ce4
Merge pull request #10740 from ziglang/stage2-float-arithmetic
stage2: add more float arithmetic and f80 support
2022-02-06 22:41:43 -05:00
Andrew Kelley
d4805472c3 compiler_rt: addXf3: add coercion to @clz
We're going to remove the first parameter from this function in the
future. Stage2 already ignores the first parameter. So we put an `@as`
in here to make it work for both.
2022-02-06 20:06:00 -07:00
Marc Tiehuis
53e6c719ef std/math: optimize division with divisors less than a half-limb
This adds a new path which avoids using compiler_rt generated div
udivmod instructions in the case that a divisor is less than half the
max usize value. Two half-limb divisions are performed instead which
ensures that non-emulated division instructions are actually used. This
does not improve the udivmod code which should still be reviewed
independently of this issue.

Notably this improves the performance of the toString implementation of
non-power-of-two bases considerably.

Division performance is improved ~1000% based on some coarse testing.

The following test code is used to provide a rough comparison between
the old vs. new method.

```
const std = @import("std");
const Managed = std.math.big.int.Managed;

const allocator = std.heap.c_allocator;

fn fib(a: *Managed, n: usize) !void {
    var b = try Managed.initSet(allocator, 1);
    defer b.deinit();
    var c = try Managed.init(allocator);
    defer c.deinit();

    var i: usize = 0;
    while (i < n) : (i += 1) {
        try c.add(a.toConst(), b.toConst());

        a.swap(&b);
        b.swap(&c);
    }
}

pub fn main() !void {
    var a = try Managed.initSet(allocator, 0);
    defer a.deinit();

    try fib(&a, 1_000_000);

    // Note: Next two lines (and printed digit count) omitted on no-print version.
    const as = try a.toString(allocator, 10, .lower);
    defer allocator.free(as);

    std.debug.print("fib: digit count: {}, limb count: {}\n", .{ as.len, a.limbs.len });
}
```

```
==> time.no-print <==
limb count: 10849

________________________________________________________
Executed in   10.60 secs    fish           external
   usr time   10.44 secs    0.00 millis   10.44 secs
   sys time    0.02 secs    1.12 millis    0.02 secs

==> time.old <==
fib: digit count: 208988, limb count: 10849

________________________________________________________
Executed in   22.78 secs    fish           external
   usr time   22.43 secs    1.01 millis   22.43 secs
   sys time    0.03 secs    0.13 millis    0.03 secs

==> time.optimized <==
fib: digit count: 208988, limb count: 10849

________________________________________________________
Executed in   11.59 secs    fish           external
   usr time   11.56 secs    1.03 millis   11.56 secs
   sys time    0.03 secs    0.12 millis    0.03 secs
```

Perf data for non-optimized and optimized, verifying no udivmod is
generated by the new code.

```
$ perf report -i perf.data.old --stdio
- Total Lost Samples: 0
-
- Samples: 90K of event 'cycles:u'
- Event count (approx.): 71603695208
-
- Overhead  Command  Shared Object     Symbol
- ........  .......  ................  ...........................................
-
    52.97%  t        t                 [.] compiler_rt.udivmod.udivmod
    45.97%  t        t                 [.] std.math.big.int.Mutable.addCarry
     0.83%  t        t                 [.] main
     0.08%  t        libc-2.33.so      [.] __memmove_avx_unaligned_erms
     0.08%  t        t                 [.] __udivti3
     0.03%  t        [unknown]         [k] 0xffffffff9a0010a7
     0.02%  t        t                 [.] std.math.big.int.Managed.ensureCapacity
     0.01%  t        libc-2.33.so      [.] _int_malloc
     0.00%  t        libc-2.33.so      [.] __malloc_usable_size
     0.00%  t        libc-2.33.so      [.] _int_free
     0.00%  t        t                 [.] 0x0000000000004a80
     0.00%  t        t                 [.] std.heap.CAllocator.resize
     0.00%  t        libc-2.33.so      [.] _mid_memalign
     0.00%  t        libc-2.33.so      [.] sysmalloc
     0.00%  t        libc-2.33.so      [.] __posix_memalign
     0.00%  t        t                 [.] std.heap.CAllocator.alloc
     0.00%  t        ld-2.33.so        [.] do_lookup_x

$ perf report -i perf.data.optimized --stdio
- Total Lost Samples: 0
-
- Samples: 46K of event 'cycles:u'
- Event count (approx.): 36790112336
-
- Overhead  Command  Shared Object     Symbol
- ........  .......  ................  ...........................................
-
    79.98%  t        t                 [.] std.math.big.int.Mutable.addCarry
    15.14%  t        t                 [.] main
     4.58%  t        t                 [.] std.math.big.int.Managed.ensureCapacity
     0.21%  t        libc-2.33.so      [.] __memmove_avx_unaligned_erms
     0.05%  t        [unknown]         [k] 0xffffffff9a0010a7
     0.02%  t        libc-2.33.so      [.] _int_malloc
     0.01%  t        t                 [.] std.heap.CAllocator.alloc
     0.01%  t        libc-2.33.so      [.] __malloc_usable_size
     0.00%  t        libc-2.33.so      [.] systrim.constprop.0
     0.00%  t        libc-2.33.so      [.] _mid_memalign
     0.00%  t        t                 [.] 0x0000000000000c7d
     0.00%  t        libc-2.33.so      [.] malloc
     0.00%  t        ld-2.33.so        [.] check_match
```

Closes #10630.
2022-02-06 21:39:34 -05:00
Jonathan Marler
4fddb591e2 rework to allow ReadFile to complete synchronously 2022-02-06 18:05:21 -07:00
Jonathan Marler
8f830207c4 fix bug I think I found while manually reviewing 2022-02-06 18:05:21 -07:00
Jonathan Marler
53d8a25dab child_process: collectOutputWindows handle broken_pipe from ReadFile
This was found on a user's machine when calling "git" as a child process from msys.  Instead of getting BROKEN_PIPE on GetOverlappedREsult, it would occur on ReadFile which would then cause the function to hang because the async operation was never started.
2022-02-06 18:05:20 -07:00
Johannes Löthberg
6f87f49f3d CLI: remove remainders of --verbose-ast and --verbose-tokenize
These options were removed in 5e63baae8 (CLI: remove --verbose-ast and
--verbose-tokenize, 2021-06-09) but some remainders were left in.

Signed-off-by: Johannes Löthberg <johannes@kyriasis.com>
2022-02-06 01:57:04 -05:00
Andrew Kelley
8dcb1eba60
Merge pull request #10738 from Vexu/f80
Add compiler-rt functions for f80
2022-02-05 20:57:32 -05:00
praschke
f2a82bafae std: allow tests to use cache and setOutputDir 2022-02-05 16:33:57 +02:00
gwenzek
0e1afb4d98
stage2: add support for Nvptx target
sample command:

/home/guw/github/zig/stage2/bin/zig build-obj cuda_kernel.zig -target nvptx64-cuda -O ReleaseSafe
this will create a kernel.ptx

expose PtxKernel call convention from LLVM
kernels are `export fn f() callconv(.PtxKernel)`
2022-02-05 16:33:00 +02:00
rohlem
fbc06f9c91 std.build.TranslateCStep: add C macro support
The string construction code is moved out of std.build.LibExeObjStep
into std.build.constructCMacroArg, to allow reusing it elsewhere.
2022-02-05 03:17:07 -05:00
Veikka Tuominen
7d04ab1f14 std.process: add option to support single quotes to ArgIteratorGeneral 2022-02-05 02:59:13 -05:00
Jan Philipp Hafer
01d48e55a5 compiler_rt: optimize mulo
- use usize to decide if register size is big enough to store
  multiplication result or if division is necessary
- multiplication routine with check of integer bounds
- wrapping multipliation and division routine from Hacker's Delight
2022-02-05 01:35:46 -05:00
Veikka Tuominen
5a7d43df23 stage1: make f80 always size 16, align 16 2022-02-04 22:44:56 +02:00
Veikka Tuominen
6a736f0c8c compiler-rt: add add/sub for f80 2022-02-04 22:38:13 +02:00
Veikka Tuominen
9bbd3ab257 compiler-rt: add comparison functions for f80 2022-02-04 22:22:43 +02:00
Veikka Tuominen
72cef17b1a compiler-rt: add trunc functions for f80 2022-02-04 22:18:44 +02:00
Veikka Tuominen
5c4ef1a64c compiler-rt: add extend functions for f80 2022-02-04 22:16:07 +02:00
Andrew Kelley
449554a730 stage2: remove anytype fields from the language
closes #10705
2022-02-01 19:06:40 -07:00
Andrew Kelley
f4a249325e stage1: avoid anytype fields for type info
prerequisite for #10705
2022-02-01 18:10:19 -07:00
Andrew Kelley
3e99495ed8
Merge pull request #10742 from ziglang/ArrayHashMapEql
std: make ArrayHashMap eql function accept an additional param
2022-02-01 13:20:28 -05:00
Andrew Kelley
75bbc74d56 a small crusade against std.meta.declarations 2022-01-31 22:25:49 -07:00
Andrew Kelley
aa326328d0 stage1: remove the data field from TypeInfo.Declaration
Partially implements #10706
2022-01-31 22:09:41 -07:00
Andrew Kelley
c46f7588ce std.fmt.parseHexFloat: clean up bitwise logic
* fold a couple separate operations into one
 * use const instead of var
 * naming conventions
2022-01-31 20:59:32 -07:00
Mateusz Radomski
7f024d6786 std: correct rounding in parse_hex_float.zig 2022-01-31 20:59:32 -07:00
John Schmidt
8e497eb32c debug: fix edge cases in macOS debug symbol lookup
This commit fixes two related things:

1. If the loop goes all the way through the slice without a match, on
   the last iteration `mid == symbols.len - 1` which causes
   `&symbols[mid + 1]` to be out of bounds. End one step before that
   instead.

2. If the address we're looking for is greater than the address of the
   last symbol in the slice, we now match it to that symbol. Previously,
   we would miss this case since we only matched if the address was _in
   between_ the address of two symbols.
2022-01-31 23:55:19 +01:00
Žiga Željko
5210b9074c os,wasi: use wasi-libc if available 2022-01-31 22:54:30 +01:00
Andrew Kelley
cf88cf2657 std: make ArrayHashMap eql function accept an additional param
which is the index of the key that already exists in the hash map.

This enables the use case of using `AutoArrayHashMap(void, void)` which
may seem surprising at first, but is actually pretty handy!
This commit includes a proof-of-concept of how I want to use it, with a
new InternArena abstraction for stage2 that provides a compact way to
store values (and types) in an "internment arena", thus making types
stored exactly once (per arena), representable with a single u32 as a
reference to a type within an InternArena, and comparable with a
simple u32 integer comparison. If both types are in the same
InternArena, you can check if they are equal by seeing if their index is
the same.

What's neat about `AutoArrayHashMap(void, void)` is that it allows us to
look up the indexes by key, *without actually storing the keys*.
Instead, keys are treated as ephemeral values that are constructed as
needed.

As a result, we have an extremely efficient encoding of types and
values, represented only by three arrays, which has no pointers, and can
therefore be serialized and deserialized by a single writev/readv call.
The `map` field is denormalized data and can be computed from the other
two fields.

This is in contrast to our current Type/Value system which makes
extensive use of pointers.

The test at the bottom of InternArena.zig passes in this commit.
2022-01-31 01:20:45 -07:00
PhaseMage
8a97807d68
Full response file (*.rsp) support
I hit the "quotes in an RSP file" issue when trying to compile gRPC using
"zig cc". As a fun exercise, I decided to see if I could fix it myself.
I'm fully open to this code being flat-out rejected. Or I can take feedback
to fix it up.

This modifies (and renames) _ArgIteratorWindows_ in process.zig such that
it works with arbitrary strings (or the contents of an RSP file).

In main.zig, this new _ArgIteratorGeneral_ is used to address the "TODO"
listed in _ClangArgIterator_.

This change closes #4833.

**Pros:**

- It has the nice attribute of handling "RSP file" arguments in the same way it
  handles "cmd_line" arguments.
- High Performance, minimal allocations
- Fixed bug in previous _ArgIteratorWindows_, where final trailing backslashes
  in a command line were entirely dropped
- Added a test case for the above bug
- Harmonized the _ArgIteratorXxxx._initWithAllocator()_ and _next()_ interface
  across Windows/Posix/Wasi (Moved Windows errors to _initWithAllocator()_
  rather than _next()_)
- Likely perf benefit on Windows by doing _utf16leToUtf8AllocZ()_ only once
  for the entire cmd_line

**Cons:**

- Breaking Change in std library on Windows: Call
  _ArgIterator.initWithAllocator()_ instead of _ArgIterator.init()_
- PhaseMage is new with contributions to Zig, might need a lot of hand-holding
- PhaseMage is a Windows person, non-Windows stuff will need to be double-checked

**Testing Done:**

- Wrote a few new test cases in process.zig
- zig.exe build test -Dskip-release (no new failures seen)
- zig cc now builds gRPC without error
2022-01-30 21:27:52 +02:00
Jakub Konka
dd7309bde4
Merge pull request #10404 from ominitay/iterator
std: Fix using `fs.Dir.Iterator` twice
2022-01-30 14:25:50 +01:00
naeu
bdd1a9e48c std: add test for Thread.Semaphore 2022-01-29 20:30:53 +00:00