Previously, `setAlignment` would set the value to 1 fewer than it should, so if you were intending to set alignment to 8 bytes, it would actually set it to 4 bytes, etc.
This enables depth-related use cases without any dependency on the Walker's internal stack which doesn't always pertain to the actual depth of the current entry (i.e. recursing into a directory immediately affects the stack).
Some decision-making might depend on the level of the traversal, so
it makes sense to expose depth here since it's stable, and not in the
automatic walker where it's not.
This matches all other platforms. Even if this field is defined as 'int'
in the C definition, the expectation is that the full 32-bit unsigned
integer range can be used. In particular this Sigaction initializer in
the new std.debug code was causing a build failure:
```zig
.flags = (posix.SA.SIGINFO | posix.SA.RESTART | posix.SA.RESETHAND)
```
Implements deflate compression from scratch. A history window is kept in
the writer's buffer for matching and a chained hash table is used to
find matches. Tokens are accumulated until a threshold is reached and
then outputted as a block. Flush is used to indicate end of stream.
Additionally, two other deflate writers are provided:
* `Raw` writes only in store blocks (the uncompressed bytes). It
utilizes data vectors to efficiently send block headers and data.
* `Huffman` only performs Huffman compression on data and no matching.
The above are also able to take advantage of writer semantics since they
do not need to keep a history.
Literal and distance code parameters in `token` have also been reworked.
Their parameters are now derived mathematically, however the more
expensive ones are still obtained through a lookup table (expect on
ReleaseSmall).
Decompression bit reading has been greatly simplified, taking advantage
of the ability to peek on the underlying reader. Additionally, a few
bugs with limit handling have been fixed.
There were only a few dozen lines of common logic, and they frankly
introduced more complexity than they eliminated. Instead, let's accept
that the implementations of `SelfInfo` are all pretty different and want
to track different state. This probably fixes some synchronization and
memory bugs by simplifying a bunch of stuff. It also improves the DWARF
unwind cache, making it around twice as fast in a debug build with the
self-hosted x86_64 backend, because we no longer have to redundantly go
through the hashmap lookup logic to find the module. Unwinding on
Windows will also see a slight performance boost from this change,
because `RtlVirtualUnwind` does not need to know the module whatsoever,
so the old `SelfInfo` implementation was doing redundant work. Lastly,
this makes it even easier to implement `SelfInfo` on freestanding
targets; there is no longer a need to emulate a real module system,
since the user controls the whole implementation!
There are various other small refactors here in the `SelfInfo`
implementations as well as in the DWARF unwinding logic. This change
turned out to make a lot of stuff simpler!
Apparently the `__eh_frame` in Mach-O binaries doesn't include the
terminator entry, but in all other respects it acts like `.eh_frame`
rather than `.debug_frame`. I have no idea.
By my estimation, these changes speed up DWARF unwinding when using the
self-hosted x86_64 backend by around 7x. There are two very significant
enhancements: we no longer iterate frames which don't fit in the stack
trace buffer, and we cache register rules (in a fixed buffer) to avoid
re-parsing and evaluating CFI instructions in most cases. Alongside this
are a bunch of smaller enhancements, such as pre-caching the result of
evaluating the CIE's initial instructions, avoiding re-parsing of CIEs,
and big simplifications to the `Dwarf.Unwind.VirtualMachine` logic.
This logic was causing some occasional infinite looping on ARM, where
the `.debug_frame` section is often incomplete since the `.exidx`
section is used for unwind information. But the information we're
getting from the compiler is totally *valid*: it's leaving the rule as
the default, which is (as with most architectures) equivalent to
`.undefined`!
This has been a TODO for ages, but in the past it didn't really matter
because stack traces are typically printed to stderr for which a mutex
is held so in practice there was a mutex guarding usage of `SelfInfo`.
However, now that `SelfInfo` is also used for simply capturing traces,
thread safety is needed. Instead of just a single mutex, though, there
are a couple of different mutexes involved; this helps make critical
sections smaller, particularly when unwinding the stack as `unwindFrame`
doesn't typically need to hold any lock at all.
Calling `current` here causes compilation failures as the C backend
currently does not emit valid MSVC inline assembly. This change means
that when building for MSVC with the self-hosted C backend, only FP
unwinding can be used.
Processes should reasonably be able to expect their children to abort
with typical exit codes, rather than a debugger breakpoint signal. This
flag in the PEB is what would be checked by `IsDebuggerPresent` in
kernel32, which is the function you would typically use for this
purpose.
This fixes `test-stack-trace` failures on Windows, as these tests were
expecting exit code 3 to indicate abort.
...and just deal with signal handlers by adding 1 to create a fake
"return address". The system I tried out where the addresses returned by
`StackIterator` were pre-subtracted didn't play nicely with error
traces, which in hindsight, makes perfect sense. This definition also
removes some ugly off-by-one issues in matching `first_address`, so I do
think this is a better approach.