This commit fixes two related things:
1. If the loop goes all the way through the slice without a match, on
the last iteration `mid == symbols.len - 1` which causes
`&symbols[mid + 1]` to be out of bounds. End one step before that
instead.
2. If the address we're looking for is greater than the address of the
last symbol in the slice, we now match it to that symbol. Previously,
we would miss this case since we only matched if the address was _in
between_ the address of two symbols.
The buffer `buf` contains N (= `slice_sizes.len`) slices followed by the
N null-terminated arguments. The N null-terminated arguments are stored
in the `contents` array list. Thus, `buf` size should be:
@sizeOf([]u8) * slice_sizes.len + contents_slice.len
Instead of:
@sizeOf([]u8) * slice_sizes.len + contents_slice.len + slice_sizes.len
This bug was found thanks to the gpa allocator which checks if freed
size matches allocated sizes for large allocations.
This commit fixes an out of bounds write that can occur when
formatting certain float values. The write messes up the stack and
causes incorrect results, segfaults, or nothing at all, depending on the
optimization mode used.
The `errol` function writes the digits of the float into `buffer`
starting from index 1, leaving index 0 untouched, and returns `buffer[1..]`
and the exponent. This is because `roundToPrecision` relies on index 0 being
unused in case the rounding adds a digit (e.g rounding 999.99
to 1000.00). When this happens, pointer arithmetic is used
[here](0e6d2184ca/lib/std/fmt/errol.zig (L61-L65))
to access index 0 and put the ones digit in the right place.
However, `errol3u` contains two special cases: `errolInt` and `errolFixed`,
which return from the function early. For these two special cases
index 0 was never reserved, and the return value contains `buffer`
instead of `buffer[1..]`. This causes the pointer arithmetic in
`roundToPrecision` to write out of bounds, which in the case of
`std.fmt.formatFloatDecimal` messes up the stack and causes undefined behavior.
The fix is to move the slicing of `buffer` to `buffer[1..]` from `errol3u`
to `errol` so that both the default and the special cases operate on the sliced
buffer.
- Add an `Metadata.isFree` helper method.
- Implement `Metadata.isTombstone` and `Metadata.isFree` with `@bitCast` then comparing to a constant. I assume `@bitCast`-then-compare is faster than the old method because it only involves one comparison, and doesn't require bitmasking.
- Summary of benchmarked changes (`gotta-go-fast`, run locally, compared to master):
- 3/4 of the hash map benchmarks used ~10% fewer cycles
- The last one (project Euler) shows 4% fewer cycles.
copy_cqes() is not guaranteed to return as many CQEs as provided in the
`wait_nr` argument, meaning the assert in `copy_cqe` can trigger.
Instead, loop until we do get at least one CQE returned.
This mimics the behaviour of liburing's _io_uring_get_cqe.
This saves on comptime format string parsing, as the compiler caches
comptime calls. The catch here, is that parsePlaceHolder cannot take the
placeholder string as a slice. It must take it as an array by value for
the caching to occure.
There is also some logic in here that ensures that the specifier_arg is
always them same slice when the items they contain are the same. This
makes the compiler stamp out less copies of formatType.
For renameat, unlinkat, mkdirat, symlinkat and linkat the error code
differs between kernel 5.4 which returns EBADF and kernel 5.10 which returns EINVAL.
Fixes#10466
Make `@returnAddress()` return for the BPF target, as the BPF target for
the time being does not support probing for the return address. Stack
traces for the general purpose allocator for the BPF target is also set
to not be captured.
Before, `std.Progress` was printing unwanted stuff to stderr. Now, the
test runner's logic to detect whether we should print each test as a
separate line to stderr is properly activated.
The status quo for the `build.zig` build system is preserved in
the sense that, if the user does not explicitly override
`dylib.setInstallName(...);` in their build script, the default
of `@rpath/libname.dylib` applies. However, should they want to
override the default behaviour, they can either:
1) unset it with
```dylib.setIntallName(null);```
2) set it to an explicit string with
```dylib.setInstallName("somename.dylib");```
When it comes to the command line however, the default is not to
use `@rpath` for the install name when creating a dylib. The user
will now be required to explicitly specify the `@rpath` as part
of the desired install name should they choose so like so:
1) with `build-lib`
```
zig build-lib -dynamic foo.zig -install_name @rpath/libfoo.dylib
```
2) with `cc`
```
zig cc -shared foo.c -o libfoo.dylib -Wl,"-install_name=@rpath/libfoo.dylib"
```
This reverts commit 11803a3a56.
Observations from the performance dashboard:
* strictly worse in terms of CPU instructions
* slightly worse wall time (but this can be noisy)
* sometimes better, sometimes worse for branch predictions
Given that the commit was introducing complexity for optimization's
sake, these performance changes do not seem worth it.
See https://github.com/ziglang/zig/pull/10337 for context.
In #10337 the `available` tracking fix necessitated an additional condition on the probe loop in both `getOrPut` and `getIndex` to prevent an infinite loop. Previously, this condition was implicit thanks to the guaranteed presence of a free slot.
The new condition hurts the `HashMap` benchmarks (https://github.com/ziglang/zig/pull/10337#issuecomment-996432758).
This commit removes that extra condition on the loop. Instead, when probing, first check whether the "home" slot is the target key — if so, return it. Otherwise, save the home slot's metadata to the stack and temporarily "free" the slot (but don't touch its value). Then continue with the original loop. Once again, the loop will be implicitly broken by the new "free" slot. The original metadata is restored before the function returns.
`getOrPut` has one additional gotcha — if the home slot is a tombstone and `getOrPut` misses, then the home slot is is written with the new key; that is, its original metadata (the tombstone) is not restored.
Other changes:
- Test hash map misses.
- Test using `getOrPutAssumeCapacity` to get keys at the end (along with `get`).