9004 commits

Author SHA1 Message Date
Andrew Kelley
51e15a9650 std.tar: add option for omitting empty directories 2023-10-08 16:54:30 -07:00
Karl Seguin
d68f39b541
std.unicode.utf8ValidateSlice: optimize implementation (#17329)
Originally inspired by Go's `utf8.Valid` function. Includes some test cases from Go's test suite.

Further optimized to be faster in all tested cases (short/long ascii/UTF8), in all release modes.

Takes advantage of SIMD for the ASCII fast path.
2023-10-06 23:49:21 -04:00
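The ASCII fast path mentioned in the `utf8ValidateSlice` commit above can be illustrated with a minimal Python sketch. This is not the Zig implementation: instead of SIMD it tests eight bytes at a time against a word-sized mask, and the names `utf8_valid` and `ASCII_MASK` are assumptions made for this example.

```python
ASCII_MASK = 0x8080808080808080  # high bit of each byte in an 8-byte word


def utf8_valid(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (illustrative sketch)."""
    i, n = 0, len(data)
    while i < n:
        # Fast path: if no byte in the next 8 has its high bit set,
        # all 8 are ASCII and can be skipped at once.
        if n - i >= 8 and int.from_bytes(data[i:i + 8], "little") & ASCII_MASK == 0:
            i += 8
            continue
        b0 = data[i]
        if b0 < 0x80:  # single ASCII byte
            i += 1
            continue
        # Pick the number of continuation bytes and the allowed range of
        # the first one; the ranges exclude overlong forms, surrogates,
        # and code points above U+10FFFF.
        if 0xC2 <= b0 <= 0xDF:
            extra, lo, hi = 1, 0x80, 0xBF
        elif b0 == 0xE0:
            extra, lo, hi = 2, 0xA0, 0xBF
        elif b0 == 0xED:
            extra, lo, hi = 2, 0x80, 0x9F
        elif 0xE1 <= b0 <= 0xEF:
            extra, lo, hi = 2, 0x80, 0xBF
        elif b0 == 0xF0:
            extra, lo, hi = 3, 0x90, 0xBF
        elif 0xF1 <= b0 <= 0xF3:
            extra, lo, hi = 3, 0x80, 0xBF
        elif b0 == 0xF4:
            extra, lo, hi = 3, 0x80, 0x8F
        else:
            return False  # stray continuation byte or invalid lead byte
        if i + extra >= n:
            return False  # sequence is truncated
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, extra + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += extra + 1
    return True
```

The slow path only runs when a chunk contains a non-ASCII byte, which is why mostly-ASCII input benefits the most from this structure.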
Becker A
5a4a5875dc
Update Server.zig:{listen, do} to specify error enums 2023-10-06 23:47:19 +00:00
Andrew Kelley
41a4908dcc
Merge pull request #17419 from ziglang/unsound-native-target-info
std: fix memory bug in getExternalExecutor
2023-10-06 16:36:45 -07:00
Ratakor
8ce33795e9 Add pause() to linux.zig 2023-10-06 11:49:31 -07:00
Ratakor
bb9a9d8f26 Add filled_sigset to os.zig 2023-10-06 11:49:31 -07:00
Ratakor
cef90eab57 Add filled_sigset to os.linux
filled_sigset is equivalent to sigfillset() as empty_sigset is
equivalent to sigemptyset().
2023-10-06 11:49:31 -07:00
castholm
ad6f8e3a59
std.math: add nextAfter (#16894)
`nextAfter()` returns the next representable value after `x` in the direction of `y` and is a standard math library function ([C++](https://en.cppreference.com/w/cpp/numeric/math/nextafter), [Java](https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#nextAfter-double-double-)). It is primarily useful for bitwise incrementing/decrementing floats.

This implementation supports runtime integers, runtime floats and `comptime_int`. `comptime_float` is not supported because NaNs/infinities are intentionally difficult to obtain and because I'm not sure if the fact that it's backed by `f128` is supposed to be an implementation detail. Either way, the user could just call the function with the floating-point type whose behavior they want at comptime and then cast the result to `comptime_float`.

The float implementation was ported from mingw-w64 with some slight changes made possible because the Zig standard library doesn't care about raising FP exceptions.

The number of test cases may seem excessive but they should cover every normal and edge case for every float type and are especially important for verifying that `f80` works.
2023-10-06 14:44:47 -04:00
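The "bitwise incrementing/decrementing" described in the `nextAfter` commit above can be sketched in Python for 64-bit floats. This is an illustration of the general technique, not a port of the Zig or mingw-w64 code; the function name `next_after` is chosen for this example only.

```python
import math
import struct


def next_after(x: float, y: float) -> float:
    """Next representable float64 after x in the direction of y (sketch)."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == y:
        return y
    if x == 0.0:
        # Step from zero to the smallest subnormal, signed toward y.
        smallest = struct.unpack("<d", struct.pack("<Q", 1))[0]
        return math.copysign(smallest, y)
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    # IEEE 754 floats are sign-magnitude: adding 1 to the bit pattern
    # moves away from zero, subtracting 1 moves toward zero.
    if (y > x) == (x > 0.0):
        bits += 1
    else:
        bits -= 1
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Python's standard library has provided `math.nextafter` since version 3.9, so the sketch is only useful for seeing how the bit pattern is stepped.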
Jakub Konka
df9462690f std: fix memory bug in getExternalExecutor
Until now, we would pass `candidate: NativeTargetInfo` which creates
a copy of the `NativeTargetInfo.DynamicLinker` buffer. We would then
return this buffer in `bad_dl: []const u8`, which would go out of scope
the moment we leave this function frame, yielding garbage. To fix this,
we just need to remember to pass by const-pointer
`candidate: *const NativeTargetInfo`.
2023-10-06 12:43:00 +02:00
Jacob Young
2a5335d7b6 x86_64: implement C abi for 128-bit integers 2023-10-04 14:42:35 -04:00
Jakub Konka
8b4e3b6aee comp: add support for -fdata-sections 2023-10-04 11:21:56 -07:00
Andrew Kelley
a306bfcd8e
Merge pull request #17344 from ziglang/type-erased-reader
std: add type-erased reader; base GenericReader on it
2023-10-04 11:21:19 -07:00
nikneym
9f0d2f9417 linux: add fanotify API 2023-10-04 13:48:22 +03:00
Ryan Liptak
ec0f76c599 GeneralPurposeAllocator.searchBucket: check current bucket before searching the list
Follow up to #17383. This is a minor optimization that only matters when a small allocation is resized/free'd soon after it is allocated.

The only real difference I was able to observe with this was via a synthetic benchmark that allocates a full bucket and then frees all but one of the slots, over and over in a loop:

Debug build:

Benchmark 1 (9 runs): gpa-degen-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           575ms ± 5.19ms     569ms …  583ms          0 ( 0%)        0%
  peak_rss           43.8MB ± 1.37KB    43.8MB … 43.8MB          1 (11%)        0%
Benchmark 2 (10 runs): gpa-degen-search-cur.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           532ms ± 5.55ms     520ms …  539ms          0 ( 0%)        -  7.5% ±  0.9%
  peak_rss           43.8MB ± 65.2KB    43.8MB … 44.0MB          1 (10%)          +  0.0% ±  0.1%

ReleaseFast build:

Benchmark 1 (129 runs): gpa-degen-master-release.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          38.9ms ± 1.12ms    36.7ms … 42.4ms          8 ( 6%)        0%
  peak_rss           23.2MB ± 2.39KB    23.2MB … 23.2MB          0 ( 0%)        0%
Benchmark 2 (151 runs): gpa-degen-search-cur-release.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          33.2ms ±  999us    31.9ms … 36.3ms         20 (13%)        - 14.7% ±  0.6%
  peak_rss           23.2MB ± 2.26KB    23.2MB … 23.2MB          0 ( 0%)          +  0.0% ±  0.0%
2023-10-04 02:55:54 -07:00
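The idea in the `searchBucket` commit above, checking the current bucket before searching the list, can be sketched as follows. The class `SizeClass`, its fields, and the linear fallback are assumptions made for illustration; the allocator itself stores its buckets in a treap.

```python
class SizeClass:
    """Hypothetical stand-in for one size class of the allocator."""

    def __init__(self, bucket_size: int):
        self.bucket_size = bucket_size
        self.buckets = []        # base addresses of all buckets
        self.current = None      # base address of the current bucket
        self.list_searches = 0   # how often the slow path ran

    def add_bucket(self, base: int) -> None:
        self.buckets.append(base)
        self.current = base

    def _contains(self, base: int, addr: int) -> bool:
        return base <= addr < base + self.bucket_size

    def search_bucket(self, addr: int):
        # Cheap check first: a recently made allocation most likely
        # lives in the current bucket.
        if self.current is not None and self._contains(self.current, addr):
            return self.current
        # Slow path: search every known bucket.
        self.list_searches += 1
        for base in self.buckets:
            if self._contains(base, addr):
                return base
        return None
```

An address inside the current bucket is resolved without touching the bucket collection at all, which matches the commit's note that the change only matters when an allocation is resized or freed soon after it is made.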
Kai Jellinghaus
11489bb04f
Update IORING_OP to reflect upstream (#17388)
Reference [upstream io_uring.h](cbf3a2cb15/include/uapi/linux/io_uring.h (L234))
2023-10-04 09:18:14 +00:00
Andrew Kelley
8ebebbd134 std.macho: remove alignment from LoadCommandIterator 2023-10-03 14:55:17 -07:00
Andrew Kelley
d8540dd708 std: add type-erased reader; base GenericReader on it
The idea here is to avoid code bloat by having only one actual io.Reader
implementation, which is type erased, and then implement a GenericReader
that preserves type information on top of that as thin glue code.

The strategy here is for that glue code to `@errSetCast` the result of
the type-erased reader functions, however, while trying to do that I
ran into #17343.
2023-10-03 14:55:17 -07:00
Andrew Kelley
7733894761
Merge pull request #17341 from rzezeski/illumos-updates
Illumos/Solaris updates
2023-10-03 11:04:41 -07:00
Andrew Kelley
47f08605bd
Merge pull request #17383 from squeek502/gpa-optim-treap
GeneralPurposeAllocator: Considerably improve worst case performance
2023-10-03 10:58:48 -07:00
Andrew Kelley
df4853a627
Merge pull request #17363 from ziglang/tar-symlinks
introduce the `zig fetch` subcommand and symlink support in zig packages
2023-10-03 03:33:26 -07:00
Frank Denis
4930094e62 valgrind.memcheck: fix makeMem*()
The `makeMem*()` functions crashed under valgrind in Debug and
ReleaseSafe modes.

The reason being that `doMemCheckClientRequestExpr()` returns `0`
when not running under Valgrind, and `maxInt(usize)` when running
under Valgrind.

Thus, `@as(i1, @intCast(maxInt(usize)))` always fails and these
functions crashed before returning.

That being said, what these functions used to return was quite
unexpected: `0` on error and `-1` on success (=running under valgrind).
That doesn't match any Zig or C convention.

But that return value doesn't seem to be very useful. Either we are
running under Valgrind or we are not. There's no point in checking this
for every single call. Applications are likely to always discard it.

So, just return a `void` instead.

Also avoid function comments that start with `Similarly, ...` because
that doesn't refer to anything in the context of autodoc or in IDEs.
2023-10-03 02:51:01 -07:00
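The failing cast described in the valgrind commit above comes down to range arithmetic: an `i1` can hold only `-1` and `0`, so `maxInt(usize)` cannot fit. A minimal Python sketch of a range-checked cast shows this; `int_cast` and `MAX_USIZE_64` are names invented for this example, and a 64-bit `usize` is assumed.

```python
def int_cast(value: int, bits: int, signed: bool) -> int:
    """Range-checked integer cast, loosely modelled on a safety-checked cast."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if not lo <= value <= hi:
        kind = "i" if signed else "u"
        raise OverflowError(f"{value} does not fit in {kind}{bits}")
    return value


MAX_USIZE_64 = (1 << 64) - 1  # maxInt(usize), assuming a 64-bit target
```

Calling `int_cast(0, 1, True)` succeeds, while `int_cast(MAX_USIZE_64, 1, True)` raises, mirroring the two return values of `doMemCheckClientRequestExpr()` named in the commit message.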
Ryan Liptak
95f4c1532a Treap: do not set key to undefined in remove to allow re-use of removed nodes 2023-10-03 01:21:51 -07:00
Ryan Liptak
cf3572a66b GeneralPurposeAllocator: Considerably improve worst case performance
Before this commit, GeneralPurposeAllocator could run into incredibly degraded performance in scenarios where the bucket count for a particular size class grew to be large. For example, if exactly `slot_count` allocations of a single size class were performed and then all of them were freed except one, then the bucket for those allocations would have to be kept around indefinitely. If that pattern of allocation were done over and over, then the bucket list for that size class could grow incredibly large.

This allocation pattern has been seen in the wild: https://github.com/Vexu/arocc/issues/508#issuecomment-1738275688

In that case, the length of the bucket list for the `128` size class would grow to tens of thousands of buckets and cause Debug runtime to balloon to ~8 minutes whereas with the c_allocator the Debug runtime would be ~3 seconds.

To address this, there are three different changes happening here:

1. std.Treap is used instead of a doubly linked list for the lists of buckets. This takes the time complexity of searchBucket [used in resize and free] from O(n) to O(log n), but increases the time complexity of insert from O(1) to O(log n) [before, all new buckets would get added to the head of the list]. Note: Any data structure with O(log n) or better search/insert/delete would also work for this use-case.
2. If the 'current' bucket for a size class is full, the list of buckets is never traversed and instead a new bucket is allocated. Previously, traversing the bucket list could only find a non-full bucket in specific circumstances, and only because of a separate optimization that is no longer needed (before, after any resize/free, the affected bucket would be moved to the head of the bucket list to allow searchBucket to perform better on average). Now, the current_bucket for each size class only changes when either (1) the current bucket is emptied/freed, or (2) a new bucket is allocated (due to the current bucket being full or null). Because each bucket's alloc_cursor only moves forward (i.e. slots within a bucket are never re-used), we can therefore always know that any bucket besides the current_bucket will be full, so traversing the list in the hopes of finding an existing non-full bucket is entirely pointless.
3. Size + alignment information for small allocations has been moved into the Bucket data instead of keeping it in a separate HashMap. This offers an improvement over the HashMap since whenever we need to get/modify the length/alignment of an allocation it's extremely likely we will already have calculated any bucket-related information necessary to get the data.

The first change is the most relevant and accounts for most of the benefit here. Also note that the overall functionality of GeneralPurposeAllocator is unchanged.

In the degraded `arocc` case, these changes bring Debug performance from ~8 minutes to ~20 seconds.

Benchmark 1: test-master.bat
  Time (mean ± σ):     481.263 s ±  5.440 s    [User: 479.159 s, System: 1.937 s]
  Range (min … max):   477.416 s … 485.109 s    2 runs

Benchmark 2: test-optim-treap.bat
  Time (mean ± σ):     19.639 s ±  0.037 s    [User: 18.183 s, System: 1.452 s]
  Range (min … max):   19.613 s … 19.665 s    2 runs

Summary
  'test-optim-treap.bat' ran
   24.51 ± 0.28 times faster than 'test-master.bat'

Note: Much of the time taken on Windows in this particular case is related to gathering stack traces. With `.stack_trace_frames = 0` the runtime goes down to 6.7 seconds, which is a little more than 2.5x slower compared to when the c_allocator is used.

These changes may or may not introduce a slight performance regression in the average case.

Here are the standard library tests on Windows in Debug mode:

Benchmark 1 (10 runs): std-tests-master.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.0s  ± 30.8ms    15.9s  … 16.1s           1 (10%)        0%
  peak_rss           42.8MB ± 8.24KB    42.8MB … 42.8MB          0 ( 0%)        0%
Benchmark 2 (10 runs): std-tests-optim-treap.exe
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          16.2s  ± 37.6ms    16.1s  … 16.3s           0 ( 0%)        💩+  1.3% ±  0.2%
  peak_rss           42.8MB ± 5.18KB    42.8MB … 42.8MB          0 ( 0%)          +  0.1% ±  0.0%

And on Linux:

Benchmark 1: ./test-master
  Time (mean ± σ):     16.091 s ±  0.088 s    [User: 15.856 s, System: 0.453 s]
  Range (min … max):   15.870 s … 16.166 s    10 runs
 
Benchmark 2: ./test-optim-treap
  Time (mean ± σ):     16.028 s ±  0.325 s    [User: 15.755 s, System: 0.492 s]
  Range (min … max):   15.735 s … 16.709 s    10 runs
 
Summary
  './test-optim-treap' ran
    1.00 ± 0.02 times faster than './test-master'
2023-10-03 01:21:51 -07:00
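The first change in the commit above, replacing an O(n) list walk with an O(log n) ordered search, can be sketched in Python. A sorted list with binary search stands in for `std.Treap` here; `BUCKET_SIZE` and both function names are assumptions made for this example.

```python
import bisect

BUCKET_SIZE = 4096  # hypothetical bucket size in bytes


def search_linear(bases, addr):
    """O(n): walk every bucket, as with a linked list."""
    for base in bases:
        if base <= addr < base + BUCKET_SIZE:
            return base
    return None


def search_ordered(sorted_bases, addr):
    """O(log n): binary search over bucket base addresses kept in order."""
    # Find the last base address that is <= addr, then check the range.
    i = bisect.bisect_right(sorted_bases, addr) - 1
    if i >= 0 and addr < sorted_bases[i] + BUCKET_SIZE:
        return sorted_bases[i]
    return None
```

Both functions return the same bucket for any address; only the number of comparisons differs, which matters once a size class holds tens of thousands of buckets. Unlike a treap, inserting into a sorted Python list is still O(n) because elements must be shifted, so the sketch models only the search side.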
xdBronch
c9c3ee704c correctly detect apple a15 and a16 chips 2023-10-03 00:36:59 -07:00
Ryan Liptak
da7ecfb2de Treap: Add InorderIterator 2023-10-02 21:11:14 -07:00
Andrew Kelley
21181181bf zig fetch: enhanced error reporting
* Package: use std.tar diagnostics to give detailed error messages
* std.tar: add diagnostic for unsupported file type
2023-10-02 17:02:25 -07:00
Andrew Kelley
ef9966c985 introduce the 'zig fetch' command + symlink support
zig fetch [options] <url>
zig fetch [options] <path>

Fetches a package which is found at <url> or <path> into the global
cache directory, printing the package hash to stdout.

Closes #16972
Related to #14280

Additionally, this commit:

* Adds uncompressed .tar support to package fetching
* Introduces symlink support to package fetching
2023-10-02 17:02:25 -07:00
Andrew Kelley
309c53295f std.fs: give readLink an explicit error set 2023-10-02 17:02:24 -07:00
Andrew Kelley
a5144d19b7 std.tar: support symlinks
closes #16678
2023-10-02 17:02:24 -07:00
Carl Åstholm
412d863ba5 std.Build: expose -idirafter to the build system 2023-10-02 16:22:07 -07:00
Ryan Zezeski
dd026588d0 illumos: fix dynamic linker path 2023-10-02 16:37:37 -06:00
Ryan Zezeski
42ad3e265c illumos does not have versions
The 5.11 in uname is not something that is ever updated. There is no
versioning of the illumos system in general. Illumos prefers to rely
on feature detection.

I can't say what Solaris does these days as I do not work at Oracle;
so I left it alone.
2023-10-02 16:23:17 -06:00
Stephen Gregoratto
285970982a Add illumos OS tag
- Adds `illumos` to the `Target.Os.Tag` enum. A new function,
  `isSolarish` has been added that returns true if the tag is either
  Solaris or Illumos. This matches the naming convention found in Rust's
  `libc` crate[1].
- Add the tag wherever `.solaris` is being checked against.
- Check for the C pre-processor macro `__illumos__` in CMake to set the
  proper target tuple. Illumos distros patch their compilers to have
  this in the "built-in" set (verified with `echo | cc -dM -E -`).

  Alternatively you could check the output of `uname -o`.

Right now, both Solaris and Illumos import from `c/solaris.zig`. In the
future it may be worth putting the shared ABI bits in a base file, and
mixing that in with specific `c/solaris.zig`/`c/illumos.zig` files.

[1]: 6e02a329a2/src/unix/solarish
2023-10-02 15:31:49 -06:00
Veikka Tuominen
63bd2bff12 Sema: add @errorCast which works for both error sets and error unions
Closes #17343
2023-10-01 17:00:01 +03:00
Jay Petacat
d8bfbbbf25 std.mem.zeroes: Zero out entire extern union, including padding
Fixes #17258
2023-10-01 02:39:05 -07:00
Andrew Kelley
376242e586
Merge pull request #17161 from tiehuis/vectorize-index-of-scalar
std.mem: add vectorized indexOfScalarPos and indexOfSentinel
2023-10-01 00:07:57 -07:00
Lucas Santos
303181901b Improve (Unmanaged)ArrayList.insert
(Unmanaged)ArrayList.insert has the same inefficiency as the old insertSlice. With the new addManyAt, the solution is trivial.
Also improves the test "growing memory preserves contents". In the previous implementation, if any changes were made to the ArrayList memory growth policy (function growMemory), the list could end up with enough capacity to not trigger a memory growth, defeating the purpose of the test. The new implementation more robustly triggers a memory growth.
2023-09-30 16:17:22 -07:00
Ryan Zezeski
54ad5f31c6 solaris: hard-code ABI and dynamic linker
Solaris/illumos is multi-lib, so you can't rely on an arbitrary
executable to give you the correct dynamic linker. Besides, it's
always the same path.
2023-09-30 11:38:56 -06:00
Ryan Zezeski
68bcd7ddd4 solaris: load CA certs file 2023-09-30 11:38:56 -06:00
Ryan Zezeski
c17ebdca6a solaris: fix path component max 2023-09-30 11:38:56 -06:00
Ryan Zezeski
f0724229d6 solaris: add missing registers 2023-09-30 11:38:56 -06:00
Marc Tiehuis
08635f08a9 fix indexOfSentinel alignment for types larger than 1 byte 2023-09-30 22:15:47 +13:00
Marc Tiehuis
5b5da0ef8c std.mem: check backend vector support for indexOfSentinel/indexOfScalarPos 2023-09-30 21:22:12 +13:00
Marc Tiehuis
cd766513fe std.mem: add vectorized indexOfScalarPos and indexOfSentinel
These are an order of magnitude quicker than the previous
implementations.

A relative comparison of each, measured by scanning a 1G file:

    Reading 1G (1.0000000009313226GiB)

             std.mem.sliceTo: 281.232ms
          vectorized.sliceTo: 24.769ms
                      strlen: 24.291ms

           std.indexOfScalar: 229.016ms
    vectorized.indexOfScalar: 24.685ms
                      memchr: 24.958ms
2023-09-30 21:19:43 +13:00
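The general idea behind the vectorized search in the commit above is to compare many bytes per step instead of one. Python has no SIMD vectors, so this sketch substitutes the classic word-at-a-time zero-byte test on 8-byte chunks; it is not the Zig implementation, and `index_of_scalar`, `ONES`, and `HIGHS` are names invented for this example.

```python
ONES = 0x0101010101010101   # 0x01 in every byte of a 64-bit word
HIGHS = 0x8080808080808080  # 0x80 in every byte of a 64-bit word


def index_of_scalar(data: bytes, value: int) -> int:
    """Index of the first byte equal to value (0..255), or -1 (sketch)."""
    pattern = value * ONES  # value repeated in all 8 bytes
    i, n = 0, len(data)
    while n - i >= 8:
        word = int.from_bytes(data[i:i + 8], "little")
        x = word ^ pattern  # bytes equal to value become 0x00
        # Zero-byte test: the lowest set bit of `hit` marks the first
        # zero byte of x, i.e. the first match in this chunk.
        hit = (x - ONES) & ~x & HIGHS
        if hit:
            lowest = hit & -hit
            return i + (lowest.bit_length() - 1) // 8
        i += 8
    # Scalar tail for the remaining bytes (fewer than 8).
    for j in range(i, n):
        if data[j] == value:
            return j
    return -1
```

The result matches `bytes.find` with a one-byte needle; the sketch shows the chunked structure only and says nothing about real-world speed in Python.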
Andrew Kelley
101df768a0
Merge pull request #17312 from LucasSantos91/master
Fix inefficiency with ArrayList.insertSlice
2023-09-29 18:15:24 -07:00
Krzysztof Wolicki
19a82ffdba
Add include_extensions to InstallDir Options (#17300)
closes #16687
2023-09-29 18:50:37 -04:00
Andrew Kelley
9013970861 std.ArrayList: fixups for the previous commit
 * Move `computeBetterCapacity` to the bottom so that `pub` stuff shows
   up first.
 * Rename `computeBetterCapacity` to `growCapacity`. Every function
   implicitly computes something; that word is always redundant in a
   function name. "better" is vague. Better in what way? Instead we
   describe what is actually happening. "grow".
 * Improve doc comments to be very explicit about when element pointers
   are invalidated or not.
 * Rename `addManyAtIndex` to `addManyAt`. The parameter is named
   `index`; that is enough.
 * Extract some duplicated code into `addManyAtAssumeCapacity` and make
   it `pub`.
 * Since I audited every line of code for correctness, I changed the
   style to my personal preference.
 * Avoid a redundant `@memset` to `undefined` - memory allocation does
   that already.
 * Fixed comment giving the wrong reason for not calling
   `ensureTotalCapacity`.
2023-09-29 13:42:38 -07:00
Lucas Santos
9d765b5ab5 std.ArrayList: insertSlice avoids extra memcpy
Includes a more robust implementation of replaceRange, which updates the
ArrayListUnmanaged if state changes in the managed part of the code
before returning an error.

Co-authored-by: Andrew Kelley <andrew@ziglang.org>
2023-09-29 12:52:40 -07:00
Krzysztof Wolicki
e919fbea9f
Step.Run: fix assert of the wrong value (#17303)
closes #16866
2023-09-29 14:14:42 -04:00
Adam Goertz
2f0e5b00b0 Allow only relative paths.
This commit makes the following changes:
* Disallow file:/// URIs
* Allow only relative paths in the .path field of build.zig.zon
* Remove now-unneeded shlwapi dependency
2023-09-29 00:32:43 -07:00
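The policy in the commit above (no `file:///` URIs, relative paths only) can be sketched as a small check. The commit message does not show how Zig implements it, so the function name `is_allowed_dependency_path` and the exact rules below are assumptions for illustration only.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_allowed_dependency_path(path: str) -> bool:
    """Accept only relative filesystem paths (illustrative policy check)."""
    if path.lower().startswith("file://"):
        return False  # file:/// URIs are rejected outright
    # Treat the path as absolute if either platform convention says so.
    if PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute():
        return False
    return True
```

Checking both POSIX and Windows conventions keeps the answer the same regardless of the host platform, which seems desirable for a manifest file shared across systems.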