Commit graph

29 commits

Author SHA1 Message Date
Dongjia Zhang
535a0c4270 use correcct symbol for the end of pcguard section 2025-04-30 00:04:22 +02:00
tjog
2e35fdd032 fuzz: fix expected section start/end symbol name on MacOS when linking libfuzzer
Not only is the section name when adding the sancov variables different.

The linker symbol ending up in the binary is also different.

Reference: 60105ac6ba/llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp (L1076-L1104)
2025-04-26 14:35:03 +02:00
Linus Groh
79460d4a3e Remove uses of deprecated callconv aliases 2025-03-05 03:01:43 +00:00
Xavier Bouchoux
84ece5624a fix -fsanitize-coverage-trace-pc-guard and fuzzer support for C compile units
- allow `-fsanitize-coverage-trace-pc-guard` to be used on its own without enabling the fuzzer.
   (note that previouly, while the flag was only active when fuzzing, the fuzzer itself doesn't use it, and the code will not link as is.)

 - add stub functions in the fuzzer to link with instrumented C code (previously fuzzed tests failed to link if they were calling into C):
   while the zig compile unit uses a custom `EmitOptions.Coverage` with features disabled,
   the C code is built calling into the clang driver with "-fsanitize=fuzzer-no-link" that automatically enables the default features.
	(see de06978ebc/clang/lib/Driver/SanitizerArgs.cpp (L587))

 - emit `-fsanitize-coverage=trace-pc-guard` instead of `-Xclang -fsanitize-coverage-trace-pc-guard` so that edge coverrage is enabled by clang driver. (previously, it was enabled only because the fuzzer was)
2025-02-21 06:06:26 +01:00
Andrew Kelley
d789f1e5cf fuzzer: write inputs to shared memory before running
breaking change to the fuzz testing API; it now passes a type-safe
context parameter to the fuzz function.

libfuzzer is reworked to select inputs from the entire corpus.

I tested that it's roughly as good as it was before in that it can find
the panics in the simple examples, as well as achieve decent coverage on
the tokenizer fuzz test.

however I think the next step here will be figuring out why so many
points of interest are missing from the tokenizer in both Debug and
ReleaseSafe modes.

does not quite close #20803 yet since there are some more important
things to be done, such as opening the previous corpus, continuing
fuzzing after finding bugs, storing the length of the inputs, etc.
2025-02-11 13:39:20 -08:00
Andrew Kelley
284de7d957 adjust runtime page size APIs
* fix merge conflicts
* rename the declarations
* reword documentation
* extract FixedBufferAllocator to separate file
* take advantage of locals
* remove the assertion about max alignment in Allocator API, leaving it
  Allocator implementation defined
* fix non-inline function call in start logic

The GeneralPurposeAllocator implementation is totally broken because it
uses global state but I didn't address that in this commit.
2025-02-06 14:23:23 -08:00
Archbirdplus
439667be04 runtime page size detection
heap.zig: define new default page sizes
heap.zig: add min/max_page_size and their options
lib/std/c: add miscellaneous declarations
heap.zig: add pageSize() and its options
switch to new page sizes, especially in GPA/stdlib
mem.zig: remove page_size
2025-02-06 14:23:23 -08:00
Jonathan Hallstrom
7ebfc72186 fix type of std_options 2024-11-05 23:46:10 +01:00
Linus Groh
8588964972 Replace deprecated default initializations with decl literals 2024-09-12 16:01:23 +01:00
Andrew Kelley
2d005827b8 make lowest stack an internal libfuzzer detail
This value is useful to help determine run uniqueness in the face of
recursion, however it is not valuable to expose to the fuzzing UI.
2024-09-11 13:41:29 -07:00
Andrew Kelley
2b76221a46 libfuzzer: use a function pointer instead of extern
solves the problem presented in the previous commit message
2024-09-11 13:41:29 -07:00
Andrew Kelley
892ce7ef52 rework fuzzing API
The previous API used `std.testing.fuzzInput(.{})` however that has the
problem that users call it multiple times incorrectly, and there might
be work happening to obtain the corpus which should not be included in
coverage analysis, and which must not slow down iteration speed.

This commit restructures it so that the main loop lives in libfuzzer and
directly calls the "test one" function.

In this commit I was a little too aggressive because I made the test
runner export `fuzzer_one` for this purpose. This was motivated by
performance, but it causes "exported symbol collision: fuzzer_one" to
occur when more than one fuzz test is provided.

There are three ways to solve this:

1. libfuzzer needs to be passed a function pointer instead. Possible
   performance downside.

2. build runner needs to build a different process per fuzz test.
   Potentially wasteful and unclear how to isolate them.

3. test runner needs to perform a relocation at runtime to point the
   function call to the relevant unit test. Portability issues and
   dubious performance gains.
2024-09-11 13:41:29 -07:00
Andrew Kelley
b8d99a3323 implement code coverage instrumentation manually
instead of relying on the LLVM sancov pass. The LLVM pass is still
executed if trace_pc_guard is requested, disabled otherwise. The LLVM
backend emits the instrumentation directly.

It uses `__sancov_pcs1` symbol name instead of `__sancov_pcs` because
each element is 1 usize instead of 2.

AIR: add CoveragePoint to branch hints which indicates whether those
branches are interesting for code coverage purposes.

Update libfuzzer to use the new instrumentation. It's simplified since
we no longer need the constructor and the pcs are now in a continguous
list.

This is a regression in the fuzzing functionality because the
instrumentation for comparisons is no longer emitted, resulting in worse
fuzzer inputs generated. A future commit will add that instrumentation
back.
2024-08-28 18:07:13 -07:00
mlugg
0fe3fd01dd
std: update std.builtin.Type fields to follow naming conventions
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.

This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
2024-08-28 08:39:59 +01:00
Andrew Kelley
f6f1ecf0f9 more optimized and correct management of 8-bit PC counters
* Upgrade from u8 to usize element types.
  - WebAssembly assumes u64. It should probably try to be target-aware
    instead.
* Move the covered PC bits to after the header so it goes on the same
  page with the other rapidly changing memory (the header stats).

depends on the semantics of accepted proposal #19755

closes #20994
2024-08-08 21:46:36 -07:00
Andrew Kelley
4e32edbff5 fuzzing: comptime assertions to protect the ABI
compile errors are nice
2024-08-08 21:46:36 -07:00
Andrew Kelley
529df8c007 libfuzzer: fix looking at wrong memory for pc counters
this fix bypasses the slice bounds, reading garbage data for up to the
last 7 bits (which are technically supposed to be ignored). that's going
to need to be fixed, let's fix that along with switching from byte elems
to usize elems.
2024-08-07 00:48:32 -07:00
Andrew Kelley
dec7e45f7c fuzzer web UI: receive coverage information
* libfuzzer: track unique runs instead of deduplicated runs
  - easier for consumers to notice when to recheck the covered bits.
* move common definitions to `std.Build.Fuzz.abi`.

build runner sends all the information needed to fuzzer web interface
client needed in order to display inline coverage information along with
source code.
2024-08-07 00:48:32 -07:00
Andrew Kelley
517cfb0dd1 fuzzing: progress towards web UI
* libfuzzer: close file after mmap
* fuzzer/main.js: connect with EventSource and debug dump the messages.
  currently this prints how many fuzzer runs have been attempted to
  console.log.
* extract some `std.debug.Info` logic into `std.debug.Coverage`.
  Prepares for consolidation across multiple different executables which
  share source files, and makes it possible to send all the
  PC/SourceLocation mapping data with 4 memcpy'd arrays.
* std.Build.Fuzz:
  - spawn a thread to watch the message queue and signal event
    subscribers.
  - track coverage map data
  - respond to /events URL with EventSource messages on a timer
2024-08-07 00:48:32 -07:00
Andrew Kelley
e0ffac4e3c introduce a web interface for fuzzing
* new .zig-cache subdirectory: 'v'
  - stores coverage information with filename of hash of PCs that want
    coverage. This hash is a hex encoding of the 64-bit coverage ID.
* build runner
  * fixed bug in file system inputs when a compile step has an
    overridden zig_lib_dir field set.
  * set some std lib options optimized for the build runner
    - no side channel mitigations
    - no Transport Layer Security
    - no crypto fork safety
  * add a --port CLI arg for choosing the port the fuzzing web interface
    listens on. it defaults to choosing a random open port.
  * introduce a web server, and serve a basic single page application
    - shares wasm code with autodocs
    - assets are created live on request, for convenient development
      experience. main.wasm is properly cached if nothing changes.
    - sources.tar comes from file system inputs (introduced with the
      `--watch` feature)
  * receives coverage ID from test runner and sends it on a thread-safe
    queue to the WebServer.
* test runner
  - takes a zig cache directory argument now, for where to put coverage
    information.
  - sends coverage ID to parent process
* fuzzer
  - puts its logs (in debug mode) in .zig-cache/tmp/libfuzzer.log
  - computes coverage_id and makes it available with
    `fuzzer_coverage_id` exported function.
  - the memory-mapped coverage file is now namespaced by the coverage id
    in hex encoding, in `.zig-cache/v`
* tokenizer
  - add a fuzz test to check that several properties are upheld
2024-08-07 00:48:32 -07:00
Andrew Kelley
ffc050e055 fuzzer: log errors and move deduplicated runs to shared mem 2024-08-07 00:48:32 -07:00
Andrew Kelley
97643c1ecc fuzzer: track code coverage from all runs
When a unique run is encountered, track it in a bit set memory-mapped
into the fuzz directory so it can be observed by other processes, even
while the fuzzer is running.
2024-08-07 00:48:32 -07:00
Andrew Kelley
688c2df646 fuzzer: use the cmp values
seems to provide better scoring
2024-07-25 18:52:21 -07:00
Andrew Kelley
6a63372053 fuzzer: basic implementation
just some experimentation. I didn't expect this to be effective so
quickly but it already can find a comparison made with mem.eql
2024-07-25 18:52:21 -07:00
Andrew Kelley
a3c74aca99 add --debug-rt CLI arg to the compiler + bonus edits
The flag makes compiler_rt and libfuzzer be in debug mode.

Also:
* fuzzer: override debug logs and disable debug logs for frequently
  called functions
* std.Build.Fuzz: fix bug of rerunning the old unit test binary
* report errors from rebuilding the unit tests better
* link.Elf: additionally add tsan lib and fuzzer lib to the hash
2024-07-25 18:52:21 -07:00
Andrew Kelley
6f3767862d implement std.testing.fuzzInput
For now this returns a dummy fuzz input.
2024-07-25 18:52:20 -07:00
Andrew Kelley
dbbe2f1094 libfuzzer: log all the libcalls to stderr 2024-07-22 14:26:17 -07:00
Andrew Kelley
7930efc60b libfuzzer: implement enough symbols for hello world 2024-07-22 13:07:02 -07:00
Andrew Kelley
54b7e144b1 initial support for integrated fuzzing
* Add the `-ffuzz` and `-fno-fuzz` CLI arguments.
* Detect fuzz testing flags from zig cc.
* Set the correct clang flags when fuzz testing is requested. It can be
  combined with TSAN and UBSAN.
* Compilation: build fuzzer library when needed which is currently an
  empty zig file.
* Add optforfuzzing to every function in the llvm backend for modules
  that have requested fuzzing.
* In ZigLLVMTargetMachineEmitToFile, add the optimization passes for
  sanitizer coverage.
* std.mem.eql uses a naive implementation optimized for fuzzing when
  builtin.fuzz is true.

Tracked by #20702
2024-07-22 13:07:02 -07:00