rework fuzz testing to be smith based

-- On the standard library side:

The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` generates values either
directly from libfuzzer or from a fixed slice of input bytes (such as a
corpus entry).

`Smith` contains the following base methods (a usage sketch follows the
list):
* `value`, a generic method for generating a value of any supported type.
* `eos`, for generating end-of-stream markers, with the additional
  guarantee that `true` will eventually be returned.
* `bytes`, for filling a byte array.
* `slice`, for filling part of a buffer and returning the filled length.
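
For example, a minimal sketch of the new callback shape using only these
base methods (the loop structure and names are illustrative):

    const std = @import("std");

    test "fuzz smith sketch" {
        const Context = struct {
            fn testOne(_: @This(), smith: *std.testing.Smith) !void {
                var buf: [64]u8 = undefined;
                while (!smith.eos()) {
                    const n = smith.value(u4); // generic over ints, enums, packed structs, ...
                    smith.bytes(buf[0..n]); // fill a fixed-size region
                    const len = smith.slice(&buf); // fill a fuzzer-chosen prefix
                    _ = len;
                }
            }
        };
        try std.testing.fuzz(Context{}, Context.testOne, .{});
    }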

`Smith.Weight` is used to give value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. it
will not be selected). Weights can only apply to values that fit within
a u64. Each of the functions above has a corresponding variant that
accepts weights. Additionally, the following functions are provided (a
weighted sketch follows the list):
* `baselineWeights`, which provides a set of weights containing every
  possible value of a type.
* `eosWeightedSimple`, for distinct weights for `true` and `false`.
* `valueRangeAtMost` and `valueRangeLessThan`, for weighting only a
  range of values.
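
For instance, a hedged sketch of a weighted request (the ranges and
weights here are arbitrary examples):

    const std = @import("std");

    // Each value in 0...15 is 8x as likely to be selected as each value in 16...1024.
    fn pickLen(smith: *std.testing.Smith) u16 {
        return smith.valueWeighted(u16, &.{
            .rangeAtMost(u16, 0, 15, 8),
            .rangeAtMost(u16, 16, 1024, 1),
        });
    }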

-- On the libfuzzer and abi side:

--- Uids

These are u32s used to classify requested values. They solve the
problem of a mutation causing a new value to be requested, shifting all
subsequent values. For example:

1. An initial input contains the values 1, 2, 3, which are interpreted
as a, b, and c respectively by the test.

2. The 1 is mutated to a 4, which causes the test to request an extra
value, interpreted as d. The input is now 4, 2, 3, 5 (new value), which
the test interprets as a, d, b, and c; however, b and c no longer
correspond to their original values.

Uids contain a hash component and a type component. The hash component
is currently determined in `Smith` by hashing the calling
`@returnAddress()`, or via an explicit argument to the corresponding
`WithHash` functions. The type component is used extensively by
libfuzzer in its hashmaps.
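
A small sketch of the two ways the hash component can be obtained (the
constant below is an arbitrary example):

    const std = @import("std");

    fn requestValues(smith: *std.testing.Smith) void {
        const a = smith.value(u32); // uid hash derived from this call's @returnAddress()
        const b = smith.valueWithHash(u32, 0xdead_beef); // uid hash supplied explicitly
        _ = .{ a, b };
    }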

--- Mutations

At the start of a cycle (a run), a random number of values to mutate is
selected, with smaller counts being exponentially more likely. The
indices of those values are drawn from a chosen uid, with a logarithmic
bias toward uids holding more values.
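
An illustrative sketch of such an exponential bias (not the fuzzer's
actual code), where each additional value to mutate is half as likely
as the previous one:

    const std = @import("std");

    fn mutationCount(rng: std.Random, max: usize) usize {
        var n: usize = 1;
        while (n < max and rng.boolean()) n += 1;
        return n;
    }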

A mutation may change a single value, several consecutive values within
a uid, or several consecutive values in the uid-independent order they
were requested. It may generate random values, mutate previous ones, or
copy from other values in the same uid, either from the same input or
spliced from another input.

For integers, mutating from previous values currently only generates
random values. For bytes, mutating from previous values mixes new
random data into the previous bytes with a set number of mutations.

--- Passive Minimization

A different approach has been taken to minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is simply added to the corpus and removed once it is no longer
valuable.

The quality of an input is measured by how many unique pcs it hit and
how many values it needed from the fuzzer. For each pc, the inputs
holding the best qualities are tracked: those hitting the minimum and
maximum number of unique pcs while needing the fewest values.

Once all of an input's qualities have been superseded for the pcs it
hit, it is removed from the corpus.
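
A hedged sketch of that bookkeeping with hypothetical names (the real
tracking lives in libfuzzer):

    const Quality = struct {
        unique_pcs: u32,
        values_needed: u32,
        /// Whether `new` displaces `old` as the best "max unique pcs" holder for a pc.
        fn supersedesMax(new: Quality, old: Quality) bool {
            if (new.unique_pcs != old.unique_pcs) return new.unique_pcs > old.unique_pcs;
            return new.values_needed < old.values_needed;
        }
    };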

-- Comparison to byte-based smith

A byte-based smith would be far less efficient and more complex than
this solution. It would be unable to solve the shifting problem that
Uids do, and it could not provide values from the fuzzer past
end-of-stream. Even with feedback, it would be unable to act on dynamic
weights, which have proven essential in the updated tests (e.g. to
constrain values to a range).

-- Test updates

All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove the reliance on heap allocation. `TokenSmith`
has been added to aid in testing `Ast` and to help inform decisions on
the smith interface.

Kendall Condon, 2025-09-22 19:04:02 -04:00
parent bb88f7bc4e
commit 93775de45f
17 changed files with 3296 additions and 1497 deletions


@ -370,7 +370,7 @@ var fuzz_amount_or_instance: u64 = undefined;
pub fn fuzz(
context: anytype,
comptime testOne: fn (context: @TypeOf(context), []const u8) anyerror!void,
comptime testOne: fn (context: @TypeOf(context), *std.testing.Smith) anyerror!void,
options: testing.FuzzInputOptions,
) anyerror!void {
// Prevent this function from confusing the fuzzer by omitting its own code
@ -397,12 +397,12 @@ pub fn fuzz(
const global = struct {
var ctx: @TypeOf(context) = undefined;
fn test_one(input: fuzz_abi.Slice) callconv(.c) void {
fn test_one() callconv(.c) void {
@disableInstrumentation();
testing.allocator_instance = .{};
defer if (testing.allocator_instance.deinit() == .leak) std.process.exit(1);
log_err_count = 0;
testOne(ctx, input.toSlice()) catch |err| switch (err) {
testOne(ctx, @constCast(&testing.Smith{ .in = null })) catch |err| switch (err) {
error.SkipZigTest => return,
else => {
std.debug.lockStdErr();
@ -422,13 +422,11 @@ pub fn fuzz(
const prev_allocator_state = testing.allocator_instance;
testing.allocator_instance = .{};
defer testing.allocator_instance = prev_allocator_state;
global.ctx = context;
fuzz_abi.fuzzer_init_test(&global.test_one, .fromSlice(builtin.test_functions[fuzz_test_index].name));
fuzz_abi.fuzzer_set_test(&global.test_one, .fromSlice(builtin.test_functions[fuzz_test_index].name));
for (options.corpus) |elem|
fuzz_abi.fuzzer_new_input(.fromSlice(elem));
fuzz_abi.fuzzer_main(fuzz_mode, fuzz_amount_or_instance);
return;
}
@ -436,10 +434,12 @@ pub fn fuzz(
// When the unit test executable is not built in fuzz mode, only run the
// provided corpus.
for (options.corpus) |input| {
try testOne(context, input);
var smith: testing.Smith = .{ .in = input };
try testOne(context, &smith);
}
// In case there is no provided corpus, also use an empty
// string as a smoke test.
try testOne(context, "");
var smith: testing.Smith = .{ .in = "" };
try testOne(context, &smith);
}

File diff suppressed because it is too large.


@ -16,12 +16,32 @@ test "simple test" {
}
test "fuzz example" {
const Context = struct {
fn testOne(context: @This(), input: []const u8) anyerror!void {
try std.testing.fuzz({}, testOne, .{});
}
fn testOne(context: void, smith: *std.testing.Smith) !void {
_ = context;
// Try passing `--fuzz` to `zig build test` and see if it manages to fail this test case!
try std.testing.expect(!std.mem.eql(u8, "canyoufindme", input));
}
const gpa = std.testing.allocator;
var list: std.ArrayList(u8) = .empty;
defer list.deinit(gpa);
while (!smith.eos()) switch (smith.value(enum { add_data, dup_data })) {
.add_data => {
const slice = try list.addManyAsSlice(gpa, smith.value(u4));
smith.bytes(slice);
},
.dup_data => {
if (list.items.len == 0) continue;
if (list.items.len > std.math.maxInt(u32)) return error.SkipZigTest;
const len = smith.valueRangeAtMost(u32, 1, @min(32, list.items.len));
const off = smith.valueRangeAtMost(u32, 0, @intCast(list.items.len - len));
try list.appendSlice(gpa, list.items[off..][0..len]);
try std.testing.expectEqualSlices(
u8,
list.items[off..][0..len],
list.items[list.items.len - len ..],
);
},
};
try std.testing.fuzz(Context{}, Context.testOne, .{});
}


@ -6,6 +6,7 @@
//! All of these components interface to some degree via an ABI:
//! * The build runner communicates with the web interface over a WebSocket connection
//! * The build runner communicates with `libfuzzer` over a shared memory-mapped file
const std = @import("std");
// Check that no WebSocket message type has implicit padding bits. This ensures we never send any
// undefined bits over the wire, and also helps validate that the layout doesn't differ between, for
@ -13,7 +14,6 @@
comptime {
const check = struct {
fn check(comptime T: type) void {
const std = @import("std");
std.debug.assert(@typeInfo(T) == .@"struct");
std.debug.assert(@typeInfo(T).@"struct".layout == .@"extern");
std.debug.assert(std.meta.hasUniqueRepresentation(T));
@ -139,14 +139,48 @@ pub const Rebuild = extern struct {
/// ABI bits specifically relating to the fuzzer interface.
pub const fuzz = struct {
pub const TestOne = *const fn (Slice) callconv(.c) void;
pub const TestOne = *const fn () callconv(.c) void;
/// A unique value to identify the related requests across runs
pub const Uid = packed struct(u32) {
kind: enum(u1) { int, bytes },
hash: u31,
pub const hashmap_ctx = struct {
pub fn hash(_: @This(), u: Uid) u32 {
// We can ignore `kind` since `hash` should be unique regardless
return u.hash;
}
pub fn eql(_: @This(), a: Uid, b: Uid, _: usize) bool {
return a == b;
}
};
};
pub extern fn fuzzer_init(cache_dir_path: Slice) void;
/// `fuzzer_init` must be called first.
pub extern fn fuzzer_coverage() Coverage;
pub extern fn fuzzer_init_test(test_one: TestOne, unit_test_name: Slice) void;
/// `fuzzer_init` must be called first.
pub extern fn fuzzer_set_test(test_one: TestOne, unit_test_name: Slice) void;
/// `fuzzer_set_test` must be called first.
/// The callee owns the memory of bytes and must not free it until `fuzzer_main` returns
pub extern fn fuzzer_new_input(bytes: Slice) void;
/// `fuzzer_set_test` must be called first.
/// Resets the fuzzer's state to that of `fuzzer_init`.
pub extern fn fuzzer_main(limit_kind: LimitKind, amount: u64) void;
pub extern fn fuzzer_unslide_address(addr: usize) usize;
pub extern fn fuzzer_int(uid: Uid, weights: Weights) u64;
pub extern fn fuzzer_eos(uid: Uid, weights: Weights) bool;
pub extern fn fuzzer_bytes(uid: Uid, out: MutSlice, weights: Weights) void;
pub extern fn fuzzer_slice(
uid: Uid,
buf: MutSlice,
len_weights: Weights,
byte_weights: Weights,
) u32;
pub const Slice = extern struct {
ptr: [*]const u8,
len: usize,
@ -160,6 +194,100 @@ pub const fuzz = struct {
}
};
pub const MutSlice = extern struct {
ptr: [*]u8,
len: usize,
pub fn toSlice(s: MutSlice) []u8 {
return s.ptr[0..s.len];
}
pub fn fromSlice(s: []u8) MutSlice {
return .{ .ptr = s.ptr, .len = s.len };
}
};
pub const Weights = extern struct {
ptr: [*]const Weight,
len: usize,
pub fn toSlice(s: Weights) []const Weight {
return s.ptr[0..s.len];
}
pub fn fromSlice(s: []const Weight) Weights {
return .{ .ptr = s.ptr, .len = s.len };
}
};
/// Increases the probability of values being selected by the fuzzer.
///
/// `weight` applies to each value in the range (i.e. not evenly across
/// the range) and must be nonzero.
///
/// In a set of weights, the total weight must not exceed 2^64 and be
/// nonzero.
pub const Weight = extern struct {
/// Inclusive
min: u64,
/// Inclusive
max: u64,
weight: u64,
fn intFromValue(x: anytype) u64 {
const T = @TypeOf(x);
return switch (@typeInfo(T)) {
.comptime_int => x,
.bool => @intFromBool(x),
.@"enum" => @intFromEnum(x),
else => @as(std.meta.Int(.unsigned, @bitSizeOf(T)), @bitCast(x)),
.int => |i| x: {
comptime {
if (i.signedness == .signed) {
@compileError("type does not have a continous range: " ++ @typeName(T));
}
// Reject types that don't have a fixed bitsize (esp. usize)
// since they are not guaranteed to fit in a u64 across targets.
if (std.mem.indexOfScalar(type, &.{
usize, c_char, c_ushort, c_uint, c_ulong, c_ulonglong,
}, T) != null) {
@compileError("type does not have a fixed bitsize: " ++ @typeName(T));
}
}
break :x x;
},
.comptime_float,
.float,
=> @compileError("type does not have a continuous range: " ++ @typeName(T)),
.pointer => @compileError("type does not have a fixed bitsize: " ++ @typeName(T)),
};
}
pub fn value(T: type, x: T, weight: u64) Weight {
return .{ .min = intFromValue(x), .max = intFromValue(x), .weight = weight };
}
pub fn rangeAtMost(T: type, at_least: T, at_most: T, weight: u64) Weight {
std.debug.assert(intFromValue(at_least) <= intFromValue(at_most));
return .{
.min = intFromValue(at_least),
.max = intFromValue(at_most),
.weight = weight,
};
}
pub fn rangeLessThan(T: type, at_least: T, less_than: T, weight: u64) Weight {
std.debug.assert(intFromValue(at_least) < intFromValue(less_than));
return .{
.min = intFromValue(at_least),
.max = intFromValue(less_than) - 1,
.weight = weight,
};
}
};
pub const LimitKind = enum(u8) { forever, iterations };
/// libfuzzer uses this and its usize is the one that counts. To match the ABI,


@ -279,7 +279,7 @@ pub fn init(
assert(buffer.len >= flate.max_window_len);
// note that disallowing some of these simplifies matching logic
assert(opts.chain != 0); // use `Huffman`, disallowing this simplifies matching
assert(opts.chain != 0); // use `Huffman`; disallowing this simplifies matching
assert(opts.good >= 3 and opts.nice >= 3); // a match will (usually) not be found
assert(opts.good <= 258 and opts.nice <= 258); // a longer match will not be found
assert(opts.lazy <= opts.nice); // a longer match will (usually) not be found
@ -558,45 +558,35 @@ test betterMatchLen {
try std.testing.fuzz({}, testFuzzedMatchLen, .{});
}
fn testFuzzedMatchLen(_: void, input: []const u8) !void {
fn testFuzzedMatchLen(_: void, smith: *std.testing.Smith) !void {
@disableInstrumentation();
var r: Io.Reader = .fixed(input);
var buf: [1024]u8 = undefined;
var w: Writer = .fixed(&buf);
var old = r.takeLeb128(u9) catch 0;
var bytes_off = @max(1, r.takeLeb128(u10) catch 258);
const prev_back = @max(1, r.takeLeb128(u10) catch 258);
while (r.takeByte()) |byte| {
const op: packed struct(u8) {
kind: enum(u2) { splat, copy, insert_imm, insert },
imm: u6,
pub fn immOrByte(op_s: @This(), r_s: *Io.Reader) usize {
return if (op_s.imm == 0) op_s.imm else @as(usize, r_s.takeByte() catch 0) + 64;
}
} = @bitCast(byte);
(switch (op.kind) {
.splat => w.splatByteAll(r.takeByte() catch 0, op.immOrByte(&r)),
while (w.unusedCapacityLen() != 0 and !smith.eosWeightedSimple(7, 1)) {
switch (smith.value(enum(u2) { splat, copy, insert })) {
.splat => w.splatByteAll(
smith.value(u8),
smith.valueRangeAtMost(u9, 1, @min(511, w.unusedCapacityLen())),
) catch unreachable,
.copy => write: {
const start = w.buffered().len -| op.immOrByte(&r);
const len = @min(w.buffered().len - start, r.takeByte() catch 3);
break :write w.writeAll(w.buffered()[start..][0..len]);
if (w.buffered().len == 0) continue;
const start = smith.valueRangeAtMost(u10, 0, @intCast(w.buffered().len - 1));
const max_len = @min(w.unusedCapacityLen(), w.buffered().len - start);
const len = smith.valueRangeAtMost(u10, 1, @intCast(max_len));
break :write w.writeAll(w.buffered()[start..][0..len]) catch unreachable;
},
.insert_imm => w.writeByte(op.imm),
.insert => w.writeAll(r.take(
@min(r.bufferedLen(), @as(usize, op.imm) + 1),
) catch unreachable),
}) catch break;
} else |_| {}
.insert => w.advance(smith.slice(w.unusedCapacitySlice())),
}
}
w.splatByteAll(0, (1 + token.min_length) -| w.buffered().len) catch unreachable;
w.splatByteAll(0, (1 + 3) -| w.buffered().len) catch unreachable;
bytes_off = @min(bytes_off, @as(u10, @intCast(w.buffered().len - 3)));
const prev_off = bytes_off -| prev_back;
assert(prev_off < bytes_off);
const max_start = w.buffered().len - token.min_length;
const bytes_off = smith.valueRangeAtMost(u10, 1, @intCast(max_start));
const prev_off = smith.valueRangeAtMost(u10, 0, bytes_off - 1);
const prev = w.buffered()[prev_off..];
const bytes = w.buffered()[bytes_off..];
old = @min(old, bytes.len - 1, token.max_length - 1);
const old = smith.valueRangeLessThan(u10, 0, @min(bytes.len, token.max_length));
const diff_index = mem.indexOfDiff(u8, prev, bytes).?; // unwrap since lengths are not same
const expected_len = @min(diff_index, 258);
@ -1036,7 +1026,7 @@ const huffman = struct {
max_bits: u4,
incomplete_allowed: bool,
) struct { u32, u16 } {
assert(out_codes.len - 1 >= @intFromBool(incomplete_allowed));
assert(out_codes.len - 1 >= @intFromBool(!incomplete_allowed));
// freqs and out_codes are in the loop to assert they are all the same length
for (freqs, out_codes, out_bits) |_, _, n| assert(n == 0);
assert(out_codes.len <= @as(u16, 1) << max_bits);
@ -1255,40 +1245,35 @@ const huffman = struct {
try std.testing.fuzz({}, checkFuzzedBuildFreqs, .{});
}
fn checkFuzzedBuildFreqs(_: void, freqs: []const u8) !void {
fn checkFuzzedBuildFreqs(_: void, smith: *std.testing.Smith) !void {
@disableInstrumentation();
var r: Io.Reader = .fixed(freqs);
var freqs_limit: u16 = 65535;
var freqs_buf: [max_leafs]u16 = undefined;
var nfreqs: u15 = 0;
const params: packed struct(u8) {
max_bits: u4,
_: u3,
incomplete_allowed: bool,
} = @bitCast(r.takeByte() catch 255);
while (nfreqs != freqs_buf.len) {
const leb = r.takeLeb128(u16);
const f = if (leb) |f| @min(f, freqs_limit) else |e| switch (e) {
error.ReadFailed => unreachable,
error.EndOfStream => 0,
error.Overflow => freqs_limit,
};
const incomplete_allowed = smith.value(bool);
while (nfreqs < @as(u8, @intFromBool(!incomplete_allowed)) + 1 or
nfreqs != freqs_buf.len and freqs_limit != 0 and
smith.eosWeightedSimple(15, 1))
{
const f = smith.valueWeighted(u16, &.{
.rangeAtMost(u16, 0, @min(31, freqs_limit), @max(freqs_limit, 1)),
.rangeAtMost(u16, 0, freqs_limit, 1),
});
freqs_buf[nfreqs] = f;
nfreqs += 1;
freqs_limit -= f;
if (leb == error.EndOfStream and nfreqs - 1 > @intFromBool(params.incomplete_allowed))
break;
nfreqs += 1;
}
var codes_buf: [max_leafs]u16 = undefined;
var bits_buf: [max_leafs]u4 = @splat(0);
const max_bits = smith.valueRangeAtMost(u4, math.log2_int_ceil(u15, nfreqs), 15);
const total_bits, const last_nonzero = build(
freqs_buf[0..nfreqs],
codes_buf[0..nfreqs],
bits_buf[0..nfreqs],
@max(math.log2_int_ceil(u15, nfreqs), params.max_bits),
params.incomplete_allowed,
max_bits,
incomplete_allowed,
);
var has_bitlen_one: bool = false;
@ -1303,21 +1288,21 @@ const huffman = struct {
}
errdefer std.log.err(
\\ params: {}
\\ incomplete_allowed: {}
\\ max_bits: {}
\\ freqs: {any}
\\ bits: {any}
\\ # freqs: {}
\\ max bits: {}
\\ weighted sum: {}
\\ has_bitlen_one: {}
\\ expected/actual total bits: {}/{}
\\ expected/actual last nonzero: {?}/{}
++ "\n", .{
params,
incomplete_allowed,
max_bits,
freqs_buf[0..nfreqs],
bits_buf[0..nfreqs],
nfreqs,
@max(math.log2_int_ceil(u15, nfreqs), params.max_bits),
weighted_sum,
has_bitlen_one,
expected_total_bits,
@ -1331,7 +1316,7 @@ const huffman = struct {
if (weighted_sum > 1 << 15)
return error.OversubscribedHuffmanTree;
if (weighted_sum < 1 << 15 and
!(params.incomplete_allowed and has_bitlen_one and weighted_sum == 1 << 14))
!(incomplete_allowed and has_bitlen_one and weighted_sum == 1 << 14))
return error.IncompleteHuffmanTree;
}
};
@ -1353,6 +1338,7 @@ fn testingFreqBufs() !*[2][65536]u8 {
}
return fbufs;
}
const FreqBufIndex = enum(u1) { gradient, random };
fn testingCheckDecompressedMatches(
flate_bytes: []const u8,
@ -1426,34 +1412,31 @@ test Compress {
try std.testing.fuzz(fbufs, testFuzzedCompressInput, .{});
}
fn testFuzzedCompressInput(fbufs: *const [2][65536]u8, input: []const u8) !void {
var in: Io.Reader = .fixed(input);
var opts: packed struct(u51) {
container: PackedContainer,
buf_size: u16,
good: u8,
nice: u8,
lazy: u8,
/// Not a `u16` to limit it for performance
chain: u9,
} = @bitCast(in.takeLeb128(u51) catch 0);
var expected_hash: flate.Container.Hasher = .init(opts.container.val());
fn testFuzzedCompressInput(fbufs: *const [2][65536]u8, smith: *std.testing.Smith) !void {
@disableInstrumentation();
const container = smith.value(flate.Container);
const good = smith.valueRangeAtMost(u16, 3, 258);
const nice = smith.valueRangeAtMost(u16, 3, 258);
const lazy = smith.valueRangeAtMost(u16, 3, nice);
const chain = smith.valueWeighted(u16, &.{
.rangeAtMost(u16, if (good <= lazy) 4 else 1, 255, 65536),
// The following weights are greatly reduced since larger values take increasingly more time to run
.rangeAtMost(u16, 256, 4095, 256),
.rangeAtMost(u16, 4096, 32767 + 256, 1),
});
var expected_hash: flate.Container.Hasher = .init(container);
var expected_size: u32 = 0;
var flate_buf: [128 * 1024]u8 = undefined;
var flate_w: Writer = .fixed(&flate_buf);
var deflate_buf: [flate.max_window_len * 2]u8 = undefined;
var deflate_w = try Compress.init(
&flate_w,
deflate_buf[0 .. flate.max_window_len + @as(usize, opts.buf_size)],
opts.container.val(),
.{
.good = @as(u16, opts.good) + 3,
.nice = @as(u16, opts.nice) + 3,
.lazy = @as(u16, @min(opts.lazy, opts.nice)) + 3,
.chain = @max(1, opts.chain, @as(u8, 4) * @intFromBool(opts.good <= opts.lazy)),
},
);
const bufsize = smith.valueRangeAtMost(u32, flate.max_window_len, @intCast(deflate_buf.len));
var deflate_w = try Compress.init(&flate_w, deflate_buf[0..bufsize], container, .{
.good = good,
.nice = nice,
.lazy = lazy,
.chain = chain,
});
// It is ensured that more bytes are not written than this to ensure this run
// does not take too long and that `flate_buf` does not run out of space.
@ -1465,79 +1448,57 @@ fn testFuzzedCompressInput(fbufs: *const [2][65536]u8, input: []const u8) !void
// extra 32 bytes is reserved on top of that for container headers and footers.
const max_size = flate_buf.len - (flate_buf_blocks * 64 + 32);
while (true) {
const data: packed struct(u36) {
is_rebase: bool,
is_bytes: bool,
params: packed union {
copy: packed struct(u34) {
len_lo: u5,
dist: u15,
len_hi: u4,
_: u10,
},
bytes: packed struct(u34) {
kind: enum(u1) { gradient, random },
off_hi: u4,
len_lo: u10,
off_mi: u4,
len_hi: u5,
off_lo: u8,
_: u2,
},
rebase: packed struct(u34) {
preserve: u17,
capacity: u17,
},
},
} = @bitCast(in.takeLeb128(u36) catch |e| switch (e) {
error.ReadFailed => unreachable,
error.Overflow => 0,
error.EndOfStream => break,
});
while (!smith.eosWeightedSimple(7, 1)) {
const max_bytes = max_size -| expected_size;
if (max_bytes == 0) break;
const buffered = deflate_w.writer.buffered();
// Required for repeating patterns and since writing from `buffered` is illegal
var copy_buf: [512]u8 = undefined;
if (data.is_rebase) {
const usable_capacity = deflate_w.writer.buffer.len - rebase_reserved_capacity;
const preserve = @min(data.params.rebase.preserve, usable_capacity);
const capacity = @min(data.params.rebase.capacity, usable_capacity -
@max(rebase_min_preserve, preserve));
try deflate_w.writer.rebase(preserve, capacity);
continue;
}
const max_bytes = max_size -| expected_size;
const bytes = if (!data.is_bytes and buffered.len != 0) bytes: {
const dist = @min(buffered.len, @as(u32, data.params.copy.dist) + 1);
const len = @min(
@max(@shlExact(@as(u9, data.params.copy.len_hi), 5) | data.params.copy.len_lo, 1),
max_bytes,
);
// Reuse the implementation's history. Otherwise our own would need maintained.
const bytes_start = buffered[buffered.len - dist ..];
const history_bytes = bytes_start[0..@min(bytes_start.len, len)];
const bytes = bytes: switch (smith.valueRangeAtMost(
u2,
@intFromBool(buffered.len == 0),
2,
)) {
0 => { // Copy
const start = smith.valueRangeLessThan(u32, 0, @intCast(buffered.len));
// Reuse the implementation's history; otherwise, our own would need maintained.
const from = buffered[start..];
const len = smith.valueRangeAtMost(u16, 1, @min(copy_buf.len, max_bytes));
const history_bytes = from[0..@min(from.len, len)];
@memcpy(copy_buf[0..history_bytes.len], history_bytes);
const new_history = len - history_bytes.len;
if (history_bytes.len != len) for ( // check needed for `- dist`
copy_buf[history_bytes.len..][0..new_history],
copy_buf[history_bytes.len - dist ..][0..new_history],
const repeat_len = len - history_bytes.len;
for (
copy_buf[history_bytes.len..][0..repeat_len],
copy_buf[0..repeat_len],
) |*next, prev| {
next.* = prev;
};
}
break :bytes copy_buf[0..len];
} else bytes: {
const off = @shlExact(@as(u16, data.params.bytes.off_hi), 12) |
@shlExact(@as(u16, data.params.bytes.off_mi), 8) |
data.params.bytes.off_lo;
const len = @shlExact(@as(u16, data.params.bytes.len_hi), 10) |
data.params.bytes.len_lo;
const fbuf = &fbufs[@intFromEnum(data.params.bytes.kind)];
break :bytes fbuf[off..][0..@min(len, fbuf.len - off, max_bytes)];
},
1 => { // Bytes
const fbuf = &fbufs[
smith.valueWeighted(u1, &.{
.value(FreqBufIndex, .gradient, 3),
.value(FreqBufIndex, .random, 1),
})
];
const len = smith.valueRangeAtMost(u32, 1, @min(fbuf.len, max_bytes));
const off = smith.valueRangeAtMost(u32, 0, @intCast(fbuf.len - len));
break :bytes fbuf[off..][0..len];
},
2 => { // Rebase
const rebaseable = bufsize - rebase_reserved_capacity;
const capacity = smith.valueRangeAtMost(u32, 1, rebaseable - rebase_min_preserve);
const preserve = smith.valueRangeAtMost(u32, 0, rebaseable - capacity);
try deflate_w.writer.rebase(preserve, capacity);
continue;
},
else => unreachable,
};
assert(bytes.len <= max_bytes);
try deflate_w.writer.writeAll(bytes);
expected_hash.update(bytes);
@ -1780,7 +1741,8 @@ fn countVec(data: []const []const u8) usize {
return bytes;
}
fn testFuzzedRawInput(data_buf: *const [4 * 65536]u8, input: []const u8) !void {
fn testFuzzedRawInput(data_buf: *const [4 * 65536]u8, smith: *std.testing.Smith) !void {
@disableInstrumentation();
const HashedStoreWriter = struct {
writer: Writer,
state: enum {
@ -1819,8 +1781,8 @@ fn testFuzzedRawInput(data_buf: *const [4 * 65536]u8, input: []const u8) !void {
/// Note that this implementation is somewhat dependent on the implementation of
/// `Raw` by expecting headers / footers to be continuous in data elements. It
/// also expects the header to be the same as `flate.Container.header` and not
/// for multiple streams to be concatenated.
/// also expects the header to be the same as `flate.Container.header` and for
/// multiple streams to not be concatenated.
fn drain(w: *Writer, data: []const []const u8, splat: usize) Writer.Error!usize {
errdefer w.* = .failing;
var h: *@This() = @fieldParentPtr("writer", w);
@ -1909,102 +1871,110 @@ fn testFuzzedRawInput(data_buf: *const [4 * 65536]u8, input: []const u8) !void {
}
fn flush(w: *Writer) Writer.Error!void {
defer w.* = .failing; // Clears buffer even if state hasn't reached `end`
defer w.* = .failing; // Empties buffer even if state hasn't reached `end`
_ = try @This().drain(w, &.{""}, 0);
}
};
var in: Io.Reader = .fixed(input);
const opts: packed struct(u19) {
container: PackedContainer,
buf_len: u17,
} = @bitCast(in.takeLeb128(u19) catch 0);
var output: HashedStoreWriter = .init(&.{}, opts.container.val());
var r_buf: [2 * 65536]u8 = undefined;
var r: Raw = try .init(
&output.writer,
r_buf[0 .. opts.buf_len +% flate.max_window_len],
opts.container.val(),
);
var data_base: u18 = 0;
var expected_hash: flate.Container.Hasher = .init(opts.container.val());
const container = smith.value(flate.Container);
var output: HashedStoreWriter = .init(&.{}, container);
var expected_hash: flate.Container.Hasher = .init(container);
var expected_size: u32 = 0;
// 10 maximum blocks is the chosen limit since it is two more
// than the maximum the implementation can output in one drain.
const max_size = 10 * @as(u32, Raw.max_block_size);
var raw_buf: [2 * @as(usize, Raw.max_block_size)]u8 = undefined;
const raw_buf_len = smith.valueWeighted(u32, &.{
.value(u32, 0, @intCast(raw_buf.len)), // unbuffered
.rangeAtMost(u32, 0, @intCast(raw_buf.len), 1),
});
var raw: Raw = try .init(&output.writer, raw_buf[0..raw_buf_len], container);
const data_buf_len: u32 = @intCast(data_buf.len);
var vecs: [32][]const u8 = undefined;
var vecs_n: usize = 0;
while (in.seek != in.end) {
const VecInfo = packed struct(u58) {
output: bool,
/// If set, `data_len` and `splat` are reinterpreted as `capacity`
/// and `preserve_len` respectively and `output` is treated as set.
rebase: bool,
block_aligning_len: bool,
block_aligning_splat: bool,
data_len: u18,
splat: u18,
data_off: u18,
while (true) {
const Op = packed struct {
drain: bool = false,
add_vec: bool = false,
rebase: bool = false,
pub const drain_only: @This() = .{ .drain = true };
pub const add_vec_only: @This() = .{ .add_vec = true };
pub const add_vec_and_drain: @This() = .{ .add_vec = true, .drain = true };
pub const drain_and_rebase: @This() = .{ .drain = true, .rebase = true };
};
var vec_info: VecInfo = @bitCast(in.takeLeb128(u58) catch |e| switch (e) {
error.ReadFailed => unreachable,
error.Overflow, error.EndOfStream => 0,
const is_eos = expected_size == max_size or smith.eosWeightedSimple(7, 1);
var op: Op = if (!is_eos) smith.valueWeighted(Op, &.{
.value(Op, .add_vec_only, 6),
.value(Op, .add_vec_and_drain, 1),
.value(Op, .drain_and_rebase, 1),
}) else .drain_only;
if (op.add_vec) {
const max_write = max_size - expected_size;
const buffered: u32 = @intCast(raw.writer.buffered().len + countVec(vecs[0..vecs_n]));
const to_align = Raw.max_block_size - buffered % Raw.max_block_size;
assert(to_align != 0); // otherwise, not helpful.
const max_data = @min(data_buf_len, max_write);
const len = smith.valueWeighted(u32, &.{
.rangeAtMost(u32, 0, max_data, 1),
.rangeAtMost(u32, 0, @min(Raw.max_block_size, max_data), 4),
.value(u32, @min(to_align, max_data), max_data), // @min 2nd arg is an edge-case
});
const off = smith.valueRangeAtMost(u32, 0, data_buf_len - len);
{
const buffered = r.writer.buffered().len + countVec(vecs[0..vecs_n]);
const to_align = mem.alignForwardAnyAlign(usize, buffered, Raw.max_block_size) - buffered;
assert((buffered + to_align) % Raw.max_block_size == 0);
if (vec_info.block_aligning_len) {
vec_info.data_len = @intCast(to_align);
} else if (vec_info.block_aligning_splat and vec_info.data_len != 0 and
to_align % vec_info.data_len == 0)
{
vec_info.splat = @divExact(@as(u18, @intCast(to_align)), vec_info.data_len) -% 1;
}
}
var splat = if (vec_info.output and !vec_info.rebase) vec_info.splat +% 1 else 1;
add_vec: {
if (vec_info.rebase) break :add_vec;
if (expected_size +| math.mulWide(u18, vec_info.data_len, splat) >
10 * (1 << 16))
{
// Skip this vector to avoid this test taking too long.
// 10 maximum sized blocks is choosen as the limit since it is two more
// than the maximum the implementation can output in one drain.
splat = 1;
break :add_vec;
}
vecs[vecs_n] = data_buf[@min(
data_base +% vec_info.data_off,
data_buf.len - vec_info.data_len,
)..][0..vec_info.data_len];
data_base +%= vec_info.data_len +% 3; // extra 3 to help catch aliasing bugs
for (0..splat) |_| expected_hash.update(vecs[vecs_n]);
expected_size += @as(u32, @intCast(vecs[vecs_n].len)) * splat;
expected_size += len;
vecs[vecs_n] = data_buf[off..][0..len];
vecs_n += 1;
op.drain |= vecs_n == vecs.len;
}
const want_drain = vecs_n == vecs.len or vec_info.output or vec_info.rebase or
in.seek == in.end;
if (want_drain and vecs_n != 0) {
try r.writer.writeSplatAll(vecs[0..vecs_n], splat);
op.drain |= is_eos;
op.drain &= vecs_n != 0;
if (op.drain) {
const pattern_len: u32 = @intCast(vecs[vecs_n - 1].len);
const pattern_len_z = @max(pattern_len, 1);
const max_write = max_size - (expected_size - pattern_len);
const buffered: u32 = @intCast(raw.writer.buffered().len + countVec(vecs[0 .. vecs_n - 1]));
const to_align = Raw.max_block_size - buffered % Raw.max_block_size;
assert(to_align != 0); // otherwise, not helpful.
const max_splat = max_write / pattern_len_z;
const weights: [3]std.testing.Smith.Weight = .{
.rangeAtMost(u32, 0, max_splat, 1),
.rangeAtMost(u32, 0, @min(
Raw.max_block_size + pattern_len_z,
max_write,
) / pattern_len_z, 4),
.value(u32, to_align / pattern_len_z, max_splat * 4),
};
const align_weight = to_align % pattern_len_z == 0 and to_align <= max_write;
const n_weights = @as(u8, 2) + @intFromBool(align_weight);
const splat = smith.valueWeighted(u32, weights[0..n_weights]);
expected_size = expected_size - pattern_len + pattern_len * splat; // splat may be zero
for (vecs[0 .. vecs_n - 1]) |v| expected_hash.update(v);
for (0..splat) |_| expected_hash.update(vecs[vecs_n - 1]);
try raw.writer.writeSplatAll(vecs[0..vecs_n], splat);
vecs_n = 0;
} else assert(splat == 1);
if (vec_info.rebase) {
try r.writer.rebase(vec_info.data_len, @min(
r.writer.buffer.len -| vec_info.data_len,
vec_info.splat,
));
}
}
try r.writer.flush();
if (op.rebase) {
const capacity = smith.valueRangeAtMost(u32, 0, raw_buf_len);
const preserve = smith.valueRangeAtMost(u32, 0, raw_buf_len - capacity);
try raw.writer.rebase(preserve, capacity);
}
if (is_eos) break;
}
try raw.writer.flush();
try output.writer.flush();
try std.testing.expectEqual(.end, output.state);
@ -2432,120 +2402,146 @@ test Huffman {
try std.testing.fuzz(fbufs, testFuzzedHuffmanInput, .{});
}
fn fuzzedHuffmanDrainSpaceLimit(max_drain: usize, written: usize, eos: bool) usize {
var block_lim = math.divCeil(usize, max_drain, Huffman.max_tokens) catch unreachable;
block_lim = @max(block_lim, @intFromBool(eos));
const footer_overhead = @as(u8, 8) * @intFromBool(eos);
// 6 for a raw block header (the block header may span two bytes)
return written + 6 * block_lim + max_drain + footer_overhead;
}
/// This function is derived from `testFuzzedRawInput` with a few changes for fuzzing `Huffman`.
fn testFuzzedHuffmanInput(fbufs: *const [2][65536]u8, input: []const u8) !void {
var in: Io.Reader = .fixed(input);
const opts: packed struct(u19) {
container: PackedContainer,
buf_len: u17,
} = @bitCast(in.takeLeb128(u19) catch 0);
fn testFuzzedHuffmanInput(fbufs: *const [2][65536]u8, smith: *std.testing.Smith) !void {
@disableInstrumentation();
const container = smith.value(flate.Container);
var flate_buf: [2 * 65536]u8 = undefined;
var flate_w: Writer = .fixed(&flate_buf);
var h_buf: [2 * 65536]u8 = undefined;
var h: Huffman = try .init(
&flate_w,
h_buf[0 .. opts.buf_len +% flate.max_window_len],
opts.container.val(),
);
var expected_hash: flate.Container.Hasher = .init(opts.container.val());
var expected_hash: flate.Container.Hasher = .init(container);
var expected_size: u32 = 0;
const max_size = 4 * @as(u32, Huffman.max_tokens);
var h_buf: [2 * @as(usize, Huffman.max_tokens)]u8 = undefined;
const h_buf_len = smith.valueWeighted(u32, &.{
.value(u32, 0, @intCast(h_buf.len)), // unbuffered
.rangeAtMost(u32, 0, @intCast(h_buf.len), 1),
});
var h: Huffman = try .init(&flate_w, h_buf[0..h_buf_len], container);
var vecs: [32][]const u8 = undefined;
var vecs_n: usize = 0;
while (in.seek != in.end) {
const VecInfo = packed struct(u55) {
output: bool,
/// If set, `data_len` and `splat` are reinterpreted as `capacity`
/// and `preserve_len` respectively and `output` is treated as set.
rebase: bool,
block_aligning_len: bool,
block_aligning_splat: bool,
data_off_hi: u8,
random_data: u1,
data_len: u16,
splat: u18,
/// This is less useful as each value is part of the same gradient 'step'
data_off_lo: u8,
while (true) {
const Op = packed struct {
drain: bool = false,
add_vec: bool = false,
rebase: bool = false,
pub const drain_only: @This() = .{ .drain = true };
pub const add_vec_only: @This() = .{ .add_vec = true };
pub const add_vec_and_drain: @This() = .{ .add_vec = true, .drain = true };
pub const drain_and_rebase: @This() = .{ .drain = true, .rebase = true };
};
var vec_info: VecInfo = @bitCast(in.takeLeb128(u55) catch |e| switch (e) {
error.ReadFailed => unreachable,
error.Overflow, error.EndOfStream => 0,
const is_eos = expected_size == max_size or smith.eosWeightedSimple(7, 1);
var op: Op = if (!is_eos) smith.valueWeighted(Op, &.{
.value(Op, .add_vec_only, 6),
.value(Op, .add_vec_and_drain, 1),
.value(Op, .drain_and_rebase, 1),
}) else .drain_only;
if (op.add_vec) {
const max_write = max_size - expected_size;
const buffered: u32 = @intCast(h.writer.buffered().len + countVec(vecs[0..vecs_n]));
const to_align = Huffman.max_tokens - buffered % Huffman.max_tokens;
assert(to_align != 0); // otherwise, not helpful.
const data_buf = &fbufs[
smith.valueWeighted(u1, &.{
.value(FreqBufIndex, .gradient, 3),
.value(FreqBufIndex, .random, 1),
})
];
const data_buf_len: u32 = @intCast(data_buf.len);
const max_data = @min(data_buf_len, max_write);
const len = smith.valueWeighted(u32, &.{
.rangeAtMost(u32, 0, max_data, 1),
.rangeAtMost(u32, 0, @min(Huffman.max_tokens, max_data), 4),
.value(u32, @min(to_align, max_data), max_data), // @min 2nd arg is an edge-case
});
const off = smith.valueRangeAtMost(u32, 0, data_buf_len - len);
{
const buffered = h.writer.buffered().len + countVec(vecs[0..vecs_n]);
const to_align = mem.alignForwardAnyAlign(usize, buffered, Huffman.max_tokens) - buffered;
assert((buffered + to_align) % Huffman.max_tokens == 0);
if (vec_info.block_aligning_len) {
vec_info.data_len = @intCast(to_align);
} else if (vec_info.block_aligning_splat and vec_info.data_len != 0 and
to_align % vec_info.data_len == 0)
{
vec_info.splat = @divExact(@as(u18, @intCast(to_align)), vec_info.data_len) -% 1;
}
}
var splat = if (vec_info.output and !vec_info.rebase) vec_info.splat +% 1 else 1;
add_vec: {
if (vec_info.rebase) break :add_vec;
if (expected_size +| math.mulWide(u18, vec_info.data_len, splat) > 4 * (1 << 16)) {
// Skip this vector to avoid this test taking too long.
splat = 1;
break :add_vec;
}
const data_buf = &fbufs[vec_info.random_data];
vecs[vecs_n] = data_buf[@min(
(@as(u16, vec_info.data_off_hi) << 8) | vec_info.data_off_lo,
data_buf.len - vec_info.data_len,
)..][0..vec_info.data_len];
for (0..splat) |_| expected_hash.update(vecs[vecs_n]);
expected_size += @as(u32, @intCast(vecs[vecs_n].len)) * splat;
expected_size += len;
vecs[vecs_n] = data_buf[off..][0..len];
vecs_n += 1;
op.drain |= vecs_n == vecs.len;
}
const want_drain = vecs_n == vecs.len or vec_info.output or vec_info.rebase or
in.seek == in.end;
if (want_drain and vecs_n != 0) {
var n = h.writer.buffered().len + Writer.countSplat(vecs[0..vecs_n], splat);
const oos = h.writer.writeSplatAll(vecs[0..vecs_n], splat) == error.WriteFailed;
n -= h.writer.buffered().len;
const block_lim = math.divCeil(usize, n, Huffman.max_tokens) catch unreachable;
const lim = flate_w.end + 6 * block_lim + n; // 6 since block header may span two bytes
if (flate_w.end > lim) return error.OverheadTooLarge;
if (oos) return;
op.drain |= is_eos;
op.drain &= vecs_n != 0;
if (op.drain) {
const pattern_len: u32 = @intCast(vecs[vecs_n - 1].len);
const pattern_len_z = @max(pattern_len, 1);
const max_write = max_size - (expected_size - pattern_len);
const buffered: u32 = @intCast(h.writer.buffered().len + countVec(vecs[0 .. vecs_n - 1]));
const to_align = Huffman.max_tokens - buffered % Huffman.max_tokens;
assert(to_align != 0); // otherwise, not helpful.
const max_splat = max_write / pattern_len_z;
const weights: [3]std.testing.Smith.Weight = .{
.rangeAtMost(u32, 0, max_splat, 1),
.rangeAtMost(u32, 0, @min(
Huffman.max_tokens + pattern_len_z,
max_write,
) / pattern_len_z, 4),
.value(u32, to_align / pattern_len_z, max_splat * 4),
};
const align_weight = to_align % pattern_len_z == 0 and to_align <= max_write;
const n_weights = @as(u8, 2) + @intFromBool(align_weight);
const splat = smith.valueWeighted(u32, weights[0..n_weights]);
expected_size = expected_size - pattern_len + pattern_len * splat; // splat may be zero
for (vecs[0 .. vecs_n - 1]) |v| expected_hash.update(v);
for (0..splat) |_| expected_hash.update(vecs[vecs_n - 1]);
const max_space = fuzzedHuffmanDrainSpaceLimit(
buffered + pattern_len * splat,
flate_w.buffered().len,
false,
);
h.writer.writeSplatAll(vecs[0..vecs_n], splat) catch
return if (max_space <= flate_w.buffer.len) error.OverheadTooLarge else {};
if (flate_w.buffered().len > max_space) return error.OverheadTooLarge;
vecs_n = 0;
} else assert(splat == 1);
if (vec_info.rebase) {
const old_end = flate_w.end;
var n = h.writer.buffered().len;
const oos = h.writer.rebase(vec_info.data_len, @min(
h.writer.buffer.len -| vec_info.data_len,
vec_info.splat,
)) == error.WriteFailed;
n -= h.writer.buffered().len;
const block_lim = math.divCeil(usize, n, Huffman.max_tokens) catch unreachable;
const lim = old_end + 6 * block_lim + n; // 6 since block header may span two bytes
if (flate_w.end > lim) return error.OverheadTooLarge;
if (oos) return;
}
}
{
const old_end = flate_w.end;
const n = h.writer.buffered().len;
const oos = h.writer.flush() == error.WriteFailed;
assert(h.writer.buffered().len == 0);
const block_lim = @max(1, math.divCeil(usize, n, Huffman.max_tokens) catch unreachable);
const lim = old_end + 6 * block_lim + n + opts.container.val().footerSize();
if (flate_w.end > lim) return error.OverheadTooLarge;
if (oos) return;
if (op.rebase) {
const capacity = smith.valueRangeAtMost(u32, 0, h_buf_len);
const preserve = smith.valueRangeAtMost(u32, 0, h_buf_len - capacity);
const max_space = fuzzedHuffmanDrainSpaceLimit(
h.writer.buffered().len,
flate_w.buffered().len,
false,
);
h.writer.rebase(preserve, capacity) catch
return if (max_space <= flate_w.buffer.len) error.OverheadTooLarge else {};
if (flate_w.buffered().len > max_space) return error.OverheadTooLarge;
}
if (is_eos) break;
}
const max_space = fuzzedHuffmanDrainSpaceLimit(
h.writer.buffered().len,
flate_w.buffered().len,
true,
);
h.writer.flush() catch
return if (max_space <= flate_w.buffer.len) error.OverheadTooLarge else {};
if (flate_w.buffered().len > max_space) return error.OverheadTooLarge;
try testingCheckDecompressedMatches(flate_w.buffered(), expected_size, expected_hash);
}


@ -414,6 +414,7 @@ pub const CpuContextPtr = if (cpu_context.Native == noreturn) noreturn else *con
/// ReleaseFast and ReleaseSmall mode. Outside of a test block, this assert
/// function is the correct function to use.
pub fn assert(ok: bool) void {
@disableInstrumentation();
if (!ok) unreachable; // assertion failure
}


@ -332,53 +332,137 @@ test "fuzz against ArrayList oracle" {
try std.testing.fuzz({}, fuzzAgainstArrayList, .{});
}
test "dumb fuzz against ArrayList oracle" {
const FuzzAllocator = struct {
smith: *std.testing.Smith,
bufs: [2][256 * 4]u8 align(4),
used_bitmap: u2,
used_len: [2]usize,
pub fn init(smith: *std.testing.Smith) FuzzAllocator {
return .{
.smith = smith,
.bufs = undefined,
.used_len = undefined,
.used_bitmap = 0,
};
}
pub fn allocator(f: *FuzzAllocator) std.mem.Allocator {
return .{
.ptr = f,
.vtable = &.{
.alloc = alloc,
.resize = resize,
.remap = remap,
.free = free,
},
};
}
pub fn allocCount(f: *FuzzAllocator) u2 {
return @popCount(f.used_bitmap);
}
fn alloc(ctx: *anyopaque, len: usize, a: std.mem.Alignment, _: usize) ?[*]u8 {
const f: *FuzzAllocator = @ptrCast(@alignCast(ctx));
assert(a == .@"4");
assert(len % 4 == 0);
const slot: u1 = @intCast(@ctz(~f.used_bitmap));
const buf: []u8 = &f.bufs[slot];
if (len > buf.len) return null;
f.used_bitmap |= @as(u2, 1) << slot;
f.used_len[slot] = len;
return buf.ptr;
}
fn memSlot(f: *FuzzAllocator, mem: []u8) u1 {
const slot: u1 = if (&mem[0] == &f.bufs[0][0])
0
else if (&mem[0] == &f.bufs[1][0])
1
else
unreachable;
assert((f.used_bitmap >> slot) & 1 == 1);
assert(mem.len == f.used_len[slot]);
return slot;
}
fn resize(ctx: *anyopaque, mem: []u8, a: std.mem.Alignment, new_len: usize, _: usize) bool {
const f: *FuzzAllocator = @ptrCast(@alignCast(ctx));
assert(a == .@"4");
assert(f.allocCount() == 1);
const slot = f.memSlot(mem);
if (new_len > f.bufs[slot].len or f.smith.value(bool)) return false;
f.used_len[slot] = new_len;
return true;
}
fn remap(ctx: *anyopaque, mem: []u8, a: std.mem.Alignment, new_len: usize, _: usize) ?[*]u8 {
const f: *FuzzAllocator = @ptrCast(@alignCast(ctx));
assert(a == .@"4");
assert(f.allocCount() == 1);
const slot = f.memSlot(mem);
if (new_len > f.bufs[slot].len or f.smith.value(bool)) return null;
if (f.smith.value(bool)) {
f.used_len[slot] = new_len;
// remap in place
return mem.ptr;
} else {
// moving remap
const new_slot = ~slot;
f.used_bitmap = ~f.used_bitmap;
f.used_len[new_slot] = new_len;
const new_buf = &f.bufs[new_slot];
@memcpy(new_buf[0..mem.len], mem);
return new_buf.ptr;
}
}
fn free(ctx: *anyopaque, mem: []u8, a: std.mem.Alignment, _: usize) void {
const f: *FuzzAllocator = @ptrCast(@alignCast(ctx));
assert(a == .@"4");
f.used_bitmap ^= @as(u2, 1) << f.memSlot(mem);
}
};
fn fuzzAgainstArrayList(_: void, smith: *std.testing.Smith) anyerror!void {
const testing = std.testing;
const gpa = testing.allocator;
const input = try gpa.alloc(u8, 1024);
defer gpa.free(input);
var prng = std.Random.DefaultPrng.init(testing.random_seed);
prng.random().bytes(input);
try fuzzAgainstArrayList({}, input);
}
fn fuzzAgainstArrayList(_: void, input: []const u8) anyerror!void {
const testing = std.testing;
const gpa = testing.allocator;
var q_gpa_inst: FuzzAllocator = .init(smith);
var l_gpa_buf: [q_gpa_inst.bufs[0].len]u8 align(4) = undefined;
var l_gpa_inst: std.heap.FixedBufferAllocator = .init(&l_gpa_buf);
const q_gpa = q_gpa_inst.allocator();
const l_gpa = l_gpa_inst.allocator();
var q: Deque(u32) = .empty;
defer q.deinit(gpa);
var l: std.ArrayList(u32) = .empty;
defer l.deinit(gpa);
if (input.len < 2) return;
var prng = std.Random.DefaultPrng.init(input[0]);
const random = prng.random();
const Action = enum {
const Action = enum(u8) {
grow,
push_back,
push_front,
pop_back,
pop_front,
grow,
/// Sentinel to avoid hardcoding the cast below
max,
};
for (input[1..]) |byte| {
switch (@as(Action, @enumFromInt(byte % (@intFromEnum(Action.max))))) {
while (!smith.eosWeightedSimple(15, 1)) {
const baseline = testing.Smith.baselineWeights(Action);
const grow_weight: testing.Smith.Weight = .value(Action, .grow, 3);
switch (smith.valueWeighted(Action, baseline ++ .{grow_weight})) {
.push_back => {
const item = random.int(u8);
const item = smith.value(u32);
try testing.expectEqual(
l.appendBounded(item),
q.pushBackBounded(item),
);
},
.push_front => {
const item = random.int(u8);
const item = smith.value(u32);
try testing.expectEqual(
l.insertBounded(0, item),
q.pushFrontBounded(item),
@ -397,11 +481,10 @@ fn fuzzAgainstArrayList(_: void, input: []const u8) anyerror!void {
// ensureTotalCapacityPrecise(), which is the most complex part
// of the Deque implementation.
.grow => {
const growth = random.int(u3);
try l.ensureTotalCapacityPrecise(gpa, l.items.len + growth);
try q.ensureTotalCapacityPrecise(gpa, q.len + growth);
const growth = smith.value(u3);
try l.ensureTotalCapacityPrecise(l_gpa, l.items.len + growth);
try q.ensureTotalCapacityPrecise(q_gpa, q.len + growth);
},
.max => unreachable,
}
try testing.expectEqual(l.getLastOrNull(), q.back());
try testing.expectEqual(
@ -417,5 +500,8 @@ fn fuzzAgainstArrayList(_: void, input: []const u8) anyerror!void {
}
try testing.expectEqual(null, it.next());
}
try testing.expectEqual(@intFromBool(q.buffer.len != 0), q_gpa_inst.allocCount());
}
q.deinit(q_gpa);
try testing.expectEqual(0, q_gpa_inst.allocCount());
}


@ -490,20 +490,3 @@ test isNumberFormattedLikeAnInteger {
try std.testing.expect(!isNumberFormattedLikeAnInteger("1e10"));
try std.testing.expect(!isNumberFormattedLikeAnInteger("1E10"));
}
test "fuzz" {
try std.testing.fuzz({}, fuzzTestOne, .{});
}
fn fuzzTestOne(_: void, input: []const u8) !void {
var buf: [16384]u8 = undefined;
var fba: std.heap.FixedBufferAllocator = .init(&buf);
var scanner = Scanner.initCompleteInput(fba.allocator(), input);
// Property: There are at most input.len tokens
var tokens: usize = 0;
while ((scanner.next() catch return) != .end_of_document) {
tokens += 1;
if (tokens > input.len) return error.Overflow;
}
}


@ -1195,6 +1195,8 @@ pub fn refAllDeclsRecursive(comptime T: type) void {
}
}
pub const Smith = @import("testing/Smith.zig");
pub const FuzzInputOptions = struct {
corpus: []const []const u8 = &.{},
};
@ -1202,7 +1204,7 @@ pub const FuzzInputOptions = struct {
/// Inline to avoid coverage instrumentation.
pub inline fn fuzz(
context: anytype,
comptime testOne: fn (context: @TypeOf(context), input: []const u8) anyerror!void,
comptime testOne: fn (context: @TypeOf(context), smith: *Smith) anyerror!void,
options: FuzzInputOptions,
) anyerror!void {
return @import("root").fuzz(context, testOne, options);
@ -1309,3 +1311,7 @@ pub const ReaderIndirect = struct {
};
}
};
test {
_ = &Smith;
}

lib/std/testing/Smith.zig (new file, 895 lines)

@ -0,0 +1,895 @@
//! Used in conjunction with `std.testing.fuzz` to generate values
const builtin = @import("builtin");
const std = @import("../std.zig");
const assert = std.debug.assert;
const fuzz_abi = std.Build.abi.fuzz;
const Smith = @This();
/// Null if the fuzzer is being used, in which case this struct will not be mutated.
///
/// Intended to be initialized directly.
in: ?[]const u8,
pub const Weight = fuzz_abi.Weight;
fn intUid(hash: u32) fuzz_abi.Uid {
@disableInstrumentation();
return @bitCast(hash << 1);
}
fn bytesUid(hash: u32) fuzz_abi.Uid {
@disableInstrumentation();
return @bitCast(hash | 1);
}
fn Backing(T: type) type {
return @Int(.unsigned, @bitSizeOf(T));
}
fn toExcessK(T: type, x: T) Backing(T) {
return @bitCast(x -% std.math.minInt(T));
}
fn fromExcessK(T: type, x: Backing(T)) T {
return @as(T, @bitCast(x)) +% std.math.minInt(T);
}
fn enumFieldLessThan(_: void, a: std.builtin.Type.EnumField, b: std.builtin.Type.EnumField) bool {
return a.value < b.value;
}
/// Returns an array of weights containing each possible value of `T`.
//
// `inline` to propagate the `comptime`ness of the result
pub inline fn baselineWeights(T: type) []const Weight {
return comptime switch (@typeInfo(T)) {
.bool, .int, .float => i: {
// Reject types that don't have a fixed bitsize (esp. usize)
// since they are not guaranteed to fit in a u64 across targets.
if (std.mem.indexOfScalar(type, &.{
isize, usize,
c_char, c_longdouble,
c_short, c_ushort,
c_int, c_uint,
c_long, c_ulong,
c_longlong, c_ulonglong,
}, T) != null) {
@compileError("type does not have a fixed bitsize: " ++ @typeName(T));
}
break :i &.{.rangeAtMost(Backing(T), 0, (1 << @bitSizeOf(T)) - 1, 1)};
},
.@"struct" => |s| if (s.backing_integer) |B|
baselineWeights(B)
else
@compileError("non-packed structs cannot be weighted"),
.@"union" => |u| if (u.layout == .@"packed")
baselineWeights(Backing(T))
else
@compileError("non-packed unions cannot be weighted"),
.@"enum" => |e| if (!e.is_exhaustive)
baselineWeights(e.tag_type)
else if (e.fields.len == 0)
// Cannot be included in below branch due to `log2_int_ceil`
@compileError("exhaustive zero-field enums cannot be weighted")
else e: {
@setEvalBranchQuota(@intCast(4 * e.fields.len *
std.math.log2_int_ceil(usize, e.fields.len)));
var sorted_fields = e.fields[0..e.fields.len].*;
std.mem.sortUnstable(std.builtin.Type.EnumField, &sorted_fields, {}, enumFieldLessThan);
var weights: []const Weight = &.{};
var seq_first: u64 = sorted_fields[0].value;
for (sorted_fields[0 .. sorted_fields.len - 1], sorted_fields[1..]) |prev, field| {
if (field.value != prev.value + 1) {
weights = weights ++ .{Weight.rangeAtMost(u64, seq_first, prev.value, 1)};
seq_first = field.value;
}
}
weights = weights ++ .{Weight.rangeAtMost(
u64,
seq_first,
sorted_fields[sorted_fields.len - 1].value,
1,
)};
break :e weights;
},
else => @compileError("unexpected type: " ++ @typeName(T)),
};
}
test baselineWeights {
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(bool, false, true, 1)},
baselineWeights(bool),
);
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(u4, 0, 15, 1)},
baselineWeights(u4),
);
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(u4, 0, 15, 1)},
baselineWeights(i4),
);
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(u16, 0, 0xffff, 1)},
baselineWeights(f16),
);
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(u4, 0, 15, 1)},
baselineWeights(packed struct(u4) { _: u4 }),
);
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(u4, 0, 15, 1)},
baselineWeights(packed union { _: u4 }),
);
try std.testing.expectEqualSlices(
Weight,
&.{.rangeAtMost(u4, 0, 15, 1)},
baselineWeights(enum(u4) { _ }),
);
try std.testing.expectEqualSlices(Weight, &.{
.rangeAtMost(u4, 0, 1, 1),
.value(u4, 3, 1),
.value(u4, 5, 1),
.rangeAtMost(u4, 8, 10, 1),
}, baselineWeights(enum(u4) {
a = 1,
b = 5,
c = 8,
d = 3,
e = 0,
f = 9,
g = 10,
}));
}
fn valueFromInt(T: anytype, int: Backing(T)) T {
@disableInstrumentation();
return switch (@typeInfo(T)) {
.@"enum" => @enumFromInt(int),
else => @bitCast(int),
};
}
fn checkWeights(weights: []const Weight, max_incl: u64) void {
@disableInstrumentation();
const w0 = weights[0]; // Sum of weights is zero
assert(w0.weight != 0);
assert(w0.max <= max_incl);
var incl_sum: u64 = (w0.max - w0.min) * w0.weight + (w0.weight - 1); // Sum of weights greater than 2^64
for (weights[1..]) |w| {
assert(w.weight != 0);
assert(w.max <= max_incl);
// This addition will not overflow except with an illegal combination of weights since
// the exclusive sum must be at least one so a span of all values is impossible.
incl_sum += (w.max - w.min + 1) * w.weight; // Sum of weights greater than 2^64
}
}
// `inline` to propogate callee's unique return address
inline fn firstHash() u32 {
return @truncate(std.hash.int(@returnAddress()));
}
// `noinline` to capture a unique return address
pub noinline fn value(s: *Smith, T: type) T {
@disableInstrumentation();
return s.valueWithHash(T, firstHash());
}
// `noinline` to capture a unique return address
pub noinline fn valueWeighted(s: *Smith, T: type, weights: []const Weight) T {
@disableInstrumentation();
return s.valueWeightedWithHash(T, weights, firstHash());
}
// `noinline` to capture a unique return address
pub noinline fn valueRangeAtMost(s: *Smith, T: type, at_least: T, at_most: T) T {
@disableInstrumentation();
return s.valueRangeAtMostWithHash(T, at_least, at_most, firstHash());
}
// `noinline` to capture a unique return address
pub noinline fn valueRangeLessThan(s: *Smith, T: type, at_least: T, less_than: T) T {
@disableInstrumentation();
return s.valueRangeLessThanWithHash(T, at_least, less_than, firstHash());
}
/// This is similar to `value(bool)`, however it is guaranteed to eventually
/// return `true` and provides the fuzzer with an extra hint about the data.
//
// `noinline` to capture a unique return address
pub noinline fn eos(s: *Smith) bool {
@disableInstrumentation();
return s.eosWithHash(firstHash());
}
/// This is similar to `value(bool)`, however it is guaranteed to eventually
/// return `true` and provides the fuzzer with an extra hint about the data.
///
/// It is asserted that the weight of `true` is non-zero.
//
// `noinline` to capture a unique return address
pub noinline fn eosWeighted(s: *Smith, weights: []const Weight) bool {
@disableInstrumentation();
return s.eosWeightedWithHash(weights, firstHash());
}
/// This is similar to `value(bool)`, however it is guaranteed to eventually
/// return `true` and provides the fuzzer with an extra hint about the data.
///
/// It is asserted that the weight of `true` is non-zero.
//
// `noinline` to capture a unique return address
pub noinline fn eosWeightedSimple(s: *Smith, false_weight: u64, true_weight: u64) bool {
@disableInstrumentation();
return s.eosWeightedSimpleWithHash(false_weight, true_weight, firstHash());
}
// `noinline` to capture a unique return address
pub noinline fn bytes(s: *Smith, out: []u8) void {
@disableInstrumentation();
return s.bytesWithHash(out, firstHash());
}
// `noinline` to capture a unique return address
pub noinline fn bytesWeighted(s: *Smith, out: []u8, weights: []const Weight) void {
@disableInstrumentation();
return s.bytesWeightedWithHash(out, weights, firstHash());
}
/// Returns the length of the filled slice
///
/// It is asserted that `buf.len` fits within a u32
// `noinline` to capture a unique return address
pub noinline fn slice(s: *Smith, buf: []u8) u32 {
@disableInstrumentation();
return s.sliceWithHash(buf, firstHash());
}
/// Returns the length of the filled slice
///
/// It is asserted that `buf.len` fits within a u32
//
// `noinline` to capture a unique return address
pub noinline fn sliceWeightedBytes(s: *Smith, buf: []u8, byte_weights: []const Weight) u32 {
@disableInstrumentation();
return s.sliceWeightedBytesWithHash(buf, byte_weights, firstHash());
}
/// Returns the length of the filled slice
///
/// It is asserted that `buf.len` fits within a u32
//
// `noinline` to capture a unique return address
pub noinline fn sliceWeighted(
s: *Smith,
buf: []u8,
len_weights: []const Weight,
byte_weights: []const Weight,
) u32 {
@disableInstrumentation();
return s.sliceWeightedWithHash(buf, len_weights, byte_weights, firstHash());
}
fn weightsContain(int: u64, weights: []const Weight) bool {
@disableInstrumentation();
var contains: bool = false;
for (weights) |w| {
contains |= w.min <= int and int <= w.max;
}
return contains;
}
/// Asserts `T` can be a member of a packed type
//
// `inline` to propagate the `comptime`ness of the result
inline fn allBitPatternsValid(T: type) bool {
return comptime switch (@typeInfo(T)) {
.void, .bool, .int, .float => true,
inline .@"struct", .@"union" => |c| c.layout == .@"packed" and for (c.fields) |f| {
if (!allBitPatternsValid(f.type)) break false;
} else true,
.@"enum" => |e| !e.is_exhaustive,
else => unreachable,
};
}
test allBitPatternsValid {
try std.testing.expect(allBitPatternsValid(packed struct {
a: void,
b: u8,
c: f16,
d: packed union {
a: u16,
b: i16,
c: f16,
},
e: enum(u4) { _ },
}));
try std.testing.expect(!allBitPatternsValid(packed union {
a: i4,
b: enum(u4) { a },
}));
}
fn UnionTagWithoutUninitializable(T: type) type {
const u = @typeInfo(T).@"union";
const Tag = u.tag_type orelse @compileError("union must have tag");
const e = @typeInfo(Tag).@"enum";
var field_names: [e.fields.len][]const u8 = undefined;
var field_values: [e.fields.len]e.tag_type = undefined;
var n_fields = 0;
for (u.fields) |f| {
switch (f.type) {
noreturn => continue,
else => {},
}
field_names[n_fields] = f.name;
field_values[n_fields] = @intFromEnum(@field(Tag, f.name));
n_fields += 1;
}
return @Enum(e.tag_type, .exhaustive, field_names[0..n_fields], field_values[0..n_fields]);
}
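// Illustrative note (added): for the union used in `test value` below,
// `union(enum(u2)) { a: u64, b: u64, c: noreturn }`, this returns an
// exhaustive enum containing only `a` and `b`, so `valueWithHash` never
// attempts to instantiate the uninhabitable `c` field.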
pub fn valueWithHash(s: *Smith, T: type, hash: u32) T {
@disableInstrumentation();
return switch (@typeInfo(T)) {
.void => {},
.bool, .int, .float => full: {
var int: Backing(T) = 0;
comptime var biti = 0;
var rhash = hash; // 'running' hash
inline while (biti < @bitSizeOf(T)) {
const n = @min(@bitSizeOf(T) - biti, 64);
const P = @Int(.unsigned, n);
int |= @as(
@TypeOf(int),
s.valueWeightedWithHash(P, baselineWeights(P), rhash),
) << biti;
biti += n;
rhash = std.hash.int(rhash);
}
break :full @bitCast(int);
},
.@"enum" => |e| if (e.is_exhaustive) v: {
if (@bitSizeOf(e.tag_type) <= 64) {
break :v s.valueWeightedWithHash(T, baselineWeights(T), hash);
}
break :v std.enums.fromInt(T, s.valueWithHash(e.tag_type, hash)) orelse
@enumFromInt(e.fields[0].value);
} else @enumFromInt(s.valueWithHash(e.tag_type, hash)),
.optional => |o| if (s.valueWithHash(bool, hash))
null
else
s.valueWithHash(o.child, std.hash.int(hash)),
inline .array, .vector => |a| arr: {
var arr: [a.len]a.child = undefined; // `T` cannot be used due to the vector case
if (a.child != u8) {
for (&arr) |*v| {
v.* = s.valueWithHash(a.child, hash);
}
} else {
s.bytesWithHash(&arr, hash);
}
break :arr arr;
},
.@"struct" => |st| if (!allBitPatternsValid(T)) v: {
var v: T = undefined;
var rhash = hash;
inline for (st.fields) |f| {
// rhash is incremented in the call so our rhash state is not reused (e.g.
// with two nested structs; note that xor would not work here, as the bit
// would be flipped back)
@field(v, f.name) = s.valueWithHash(f.type, rhash +% 1);
rhash = std.hash.int(rhash);
}
break :v v;
} else @bitCast(s.valueWithHash(st.backing_integer.?, hash)),
.@"union" => if (!allBitPatternsValid(T))
switch (s.valueWithHash(
UnionTagWithoutUninitializable(T),
// hash is incremented in the call so our hash state is not reused for below
std.hash.int(hash +% 1),
)) {
inline else => |t| @unionInit(
T,
@tagName(t),
s.valueWithHash(@FieldType(T, @tagName(t)), hash),
),
}
else
@bitCast(s.valueWithHash(Backing(T), hash)),
else => @compileError("unexpected type '" ++ @typeName(T) ++ "'"),
};
}
pub fn valueWeightedWithHash(s: *Smith, T: type, weights: []const Weight, hash: u32) T {
@disableInstrumentation();
checkWeights(weights, (1 << @bitSizeOf(T)) - 1);
return valueFromInt(T, @intCast(s.valueWeightedWithHashInner(weights, hash)));
}
fn valueWeightedWithHashInner(s: *Smith, weights: []const Weight, hash: u32) u64 {
@disableInstrumentation();
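// Two sources of data (as exercised by the tests below): when `s.in` is set,
// the value is replayed from a recorded little-endian byte stream; otherwise,
// when fuzzing, it is requested live from the fuzzer through the ABI.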
return if (s.in) |*in| int: {
if (in.len < 8) {
@branchHint(.unlikely);
in.* = &.{};
break :int weights[0].min;
}
const int = std.mem.readInt(u64, in.*[0..8], .little);
in.* = in.*[8..];
break :int if (weightsContain(int, weights)) int else weights[0].min;
} else if (builtin.fuzz) int: {
@branchHint(.likely);
break :int fuzz_abi.fuzzer_int(intUid(hash), .fromSlice(weights));
} else unreachable;
}
pub fn valueRangeAtMostWithHash(s: *Smith, T: type, at_least: T, at_most: T, hash: u32) T {
@disableInstrumentation();
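// Here and in `valueRangeLessThanWithHash` below, signed ranges are remapped
// onto the unsigned backing integer using an excess-K (offset binary)
// encoding so they can be expressed as the unsigned `Weight` ranges the
// fuzzer understands.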
if (@typeInfo(T) == .int and @typeInfo(T).int.signedness == .signed) {
return fromExcessK(T, s.valueRangeAtMostWithHash(
Backing(T),
toExcessK(T, at_least),
toExcessK(T, at_most),
hash,
));
}
return s.valueWeightedWithHash(T, &.{.rangeAtMost(T, at_least, at_most, 1)}, hash);
}
pub fn valueRangeLessThanWithHash(s: *Smith, T: type, at_least: T, less_than: T, hash: u32) T {
@disableInstrumentation();
if (@typeInfo(T) == .int and @typeInfo(T).int.signedness == .signed) {
return fromExcessK(T, s.valueRangeLessThanWithHash(
Backing(T),
toExcessK(T, at_least),
toExcessK(T, less_than),
hash,
));
}
return s.valueWeightedWithHash(T, &.{.rangeLessThan(T, at_least, less_than, 1)}, hash);
}
/// This is similar to `value(bool)`; however, it is guaranteed to eventually
/// return `true` and provides the fuzzer with an extra hint about the data.
pub fn eosWithHash(s: *Smith, hash: u32) bool {
@disableInstrumentation();
return s.eosWeightedWithHash(baselineWeights(bool), hash);
}
/// This is similar to `value(bool)`; however, it is guaranteed to eventually
/// return `true` and provides the fuzzer with an extra hint about the data.
///
/// It is asserted that the weight of `true` is non-zero.
pub fn eosWeightedWithHash(s: *Smith, weights: []const Weight, hash: u32) bool {
@disableInstrumentation();
checkWeights(weights, 1);
for (weights) |w| (if (w.max == 1) break) else unreachable; // `true` must have non-zero weight
if (s.in) |*in| {
if (in.len == 0) {
@branchHint(.unlikely);
return true;
}
const eos_val = in.*[0] != 0;
in.* = in.*[1..];
return eos_val or b: {
var only_true: bool = true;
for (weights) |w| {
only_true &= @as(u1, @intCast(w.min)) == 1;
}
break :b only_true;
};
} else if (builtin.fuzz) {
@branchHint(.likely);
return fuzz_abi.fuzzer_eos(intUid(hash), .fromSlice(weights));
} else unreachable;
}
/// This is similar to `value(bool)`; however, it is guaranteed to eventually
/// return `true` and provides the fuzzer with an extra hint about the data.
///
/// It is asserted that the weight of `false` is non-zero.
/// It is asserted that the weight of `true` is non-zero.
pub fn eosWeightedSimpleWithHash(s: *Smith, false_weight: u64, true_weight: u64, hash: u32) bool {
@disableInstrumentation();
return s.eosWeightedWithHash(&.{
.value(bool, false, false_weight),
.value(bool, true, true_weight),
}, hash);
}
pub fn bytesWithHash(s: *Smith, out: []u8, hash: u32) void {
@disableInstrumentation();
return s.bytesWeightedWithHash(out, baselineWeights(u8), hash);
}
pub fn bytesWeightedWithHash(s: *Smith, out: []u8, weights: []const Weight, hash: u32) void {
@disableInstrumentation();
checkWeights(weights, 255);
if (s.in) |*in| {
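// Replay mode: recorded bytes are copied through, with any byte outside the
// allowed weights coerced to an in-range default; bytes past the end of the
// recorded input are filled with the default as well.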
var present_weights: [256]bool = @splat(false);
for (weights) |w| {
@memset(present_weights[@intCast(w.min)..@intCast(w.max + 1)], true);
}
const default: u8 = @intCast(weights[0].min);
const copy_len = @min(out.len, in.len);
for (in.*[0..copy_len], out[0..copy_len]) |i, *o| {
o.* = if (present_weights[i]) i else default;
}
in.* = in.*[copy_len..];
@memset(out[copy_len..], default);
} else if (builtin.fuzz) {
@branchHint(.likely);
fuzz_abi.fuzzer_bytes(bytesUid(hash), .fromSlice(out), .fromSlice(weights));
} else unreachable;
}
/// Returns the length of the filled slice
///
/// It is asserted that `buf.len` fits within a u32
pub fn sliceWithHash(s: *Smith, buf: []u8, hash: u32) u32 {
@disableInstrumentation();
return s.sliceWeightedBytesWithHash(buf, baselineWeights(u8), hash);
}
/// Returns the length of the filled slice
///
/// It is asserted that `buf.len` fits within a u32
pub fn sliceWeightedBytesWithHash(
s: *Smith,
buf: []u8,
byte_weights: []const Weight,
hash: u32,
) u32 {
@disableInstrumentation();
return s.sliceWeightedWithHash(
buf,
&.{.rangeAtMost(u32, 0, @intCast(buf.len), 1)},
byte_weights,
hash,
);
}
/// Returns the length of the filled slice
///
/// It is asserted that `buf.len` fits within a u32
pub fn sliceWeightedWithHash(
s: *Smith,
buf: []u8,
len_weights: []const Weight,
byte_weights: []const Weight,
hash: u32,
) u32 {
@disableInstrumentation();
checkWeights(byte_weights, 255);
checkWeights(len_weights, @as(u32, @intCast(buf.len)));
if (s.in) |*in| {
const in_len = len: {
if (in.len < 4) {
@branchHint(.unlikely);
in.* = &.{};
break :len 0;
}
const len = std.mem.readInt(u32, in.*[0..4], .little);
in.* = in.*[4..];
break :len @min(len, in.len);
};
const out_len: u32 = if (weightsContain(in_len, len_weights))
in_len
else
@intCast(len_weights[0].min);
var present_weights: [256]bool = @splat(false);
for (byte_weights) |w| {
@memset(present_weights[@intCast(w.min)..@intCast(w.max + 1)], true);
}
const default: u8 = @intCast(byte_weights[0].min);
const copy_len = @min(out_len, in_len);
for (in.*[0..copy_len], buf[0..copy_len]) |i, *o| {
o.* = if (present_weights[i]) i else default;
}
in.* = in.*[in_len..];
@memset(buf[copy_len..], default);
return out_len;
} else if (builtin.fuzz) {
@branchHint(.likely);
return fuzz_abi.fuzzer_slice(
bytesUid(hash),
.fromSlice(buf),
.fromSlice(len_weights),
.fromSlice(byte_weights),
);
} else unreachable;
}
fn constructInput(comptime values: []const union(enum) {
eos: bool,
int: u64,
bytes: []const u8,
slice: []const u8,
}) []const u8 {
const result = comptime result: {
var result: [
len: {
var len = 0;
for (values) |v| len += switch (v) {
.eos => 1,
.int => 8,
.bytes => |b| b.len,
.slice => |s| 4 + s.len,
};
break :len len;
}
]u8 = undefined;
var w: std.Io.Writer = .fixed(&result);
for (values) |v| switch (v) {
.eos => |e| w.writeByte(@intFromBool(e)) catch unreachable,
.int => |i| w.writeInt(u64, i, .little) catch unreachable,
.bytes => |b| w.writeAll(b) catch unreachable,
.slice => |s| {
w.writeInt(u32, @intCast(s.len), .little) catch unreachable;
w.writeAll(s) catch unreachable;
},
};
break :result result;
};
return &result;
}
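// Encoding sketch (added for clarity): values serialize to the same
// little-endian stream the replay paths above consume. For example,
//
//     constructInput(&.{ .{ .eos = false }, .{ .int = 3 }, .{ .slice = "ab" } })
//
// produces one 0x00 byte, eight bytes encoding the u64 3, then the u32
// length 2 followed by the bytes "ab".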
test value {
if (@import("builtin").zig_backend == .stage2_c) return error.SkipZigTest; // TODO
const S = struct {
v: void = {},
b: bool = true,
ih: u16 = 123,
iq: u64 = 55555,
io: u128 = (1 << 80) | (1 << 23),
fd: f64 = std.math.pi,
ft: f80 = std.math.e,
eh: enum(u16) { a, _ } = @enumFromInt(999),
eo: enum(u128) { a, b, _ } = .b,
aw: [3]u32 = .{ 1 << 30, 1 << 20, 1 << 10 },
vw: @Vector(3, u32) = .{ 1 << 10, 1 << 20, 1 << 30 },
ab: [3]u8 = .{ 55, 33, 88 },
vb: @Vector(3, u8) = .{ 22, 44, 99 },
s: struct { q: u64 } = .{ .q = 1 },
sz: struct {} = .{},
sp: packed struct(u8) { a: u5, b: u3 } = .{ .a = 31, .b = 3 },
si: packed struct(u8) { a: u5, b: enum(u3) { a, b } } = .{ .a = 15, .b = .b },
u: union(enum(u2)) {
a: u64,
b: u64,
c: noreturn,
} = .{ .b = 777777 },
up: packed union {
a: u16,
b: f16,
} = .{ .b = std.math.phi },
invalid: struct {
ib: u8 = 0,
eb: enum(u8) { a, b } = .a,
eo: enum(u128) { a, b } = .a,
u: union(enum(u1)) { a: noreturn, b: void } = .{ .b = {} },
} = .{},
};
const s: S = .{};
const ft_bits: u80 = @bitCast(s.ft);
const eo_bits = @intFromEnum(s.eo);
var smith: Smith = .{
.in = constructInput(&.{
// v
.{ .int = @intFromBool(s.b) }, // b
.{ .int = s.ih }, // ih
.{ .int = s.iq }, // iq
.{ .int = @truncate(s.io) }, .{ .int = @intCast(s.io >> 64) }, // io
.{ .int = @bitCast(s.fd) }, // fd
.{ .int = @truncate(ft_bits) }, .{ .int = @intCast(ft_bits >> 64) }, // ft
.{ .int = @intFromEnum(s.eh) }, // eh
.{ .int = @truncate(eo_bits) }, .{ .int = @intCast(eo_bits >> 64) }, // eo
.{ .int = s.aw[0] }, .{ .int = s.aw[1] }, .{ .int = s.aw[2] }, // aw
.{ .int = s.vw[0] }, .{ .int = s.vw[1] }, .{ .int = s.vw[2] }, // vw
.{ .bytes = &s.ab }, // ab
.{ .bytes = &@as([3]u8, s.vb) }, // vb
.{ .int = s.s.q }, // s.q
//sz
.{ .int = @as(u8, @bitCast(s.sp)) }, // sp
.{ .int = s.si.a }, .{ .int = @intFromEnum(s.si.b) }, // si
.{ .int = @intFromEnum(s.u) }, .{ .int = s.u.b }, // u
.{ .int = @as(u16, @bitCast(s.up)) }, // up
// invalid values
.{ .int = 555 }, // invalid.ib
.{ .int = 123 }, // invalid.eb
.{ .int = 0 }, .{ .int = 1 }, // invalid.eo
.{ .int = 0 }, // invalid.u
}),
};
try std.testing.expectEqual(s, smith.value(S));
}
test valueWeighted {
var smith: Smith = .{
.in = constructInput(&.{
.{ .int = 200 },
.{ .int = 200 },
.{ .int = 300 },
.{ .int = 400 },
}),
};
try std.testing.expectEqual(200, smith.valueWeighted(u8, &.{.rangeAtMost(u8, 50, 200, 1)}));
try std.testing.expectEqual(50, smith.valueWeighted(u8, &.{.rangeLessThan(u8, 50, 200, 1)}));
const E = enum(u64) { a = 100, b = 200, c = 300 };
try std.testing.expectEqual(E.c, smith.valueWeighted(E, baselineWeights(E)));
try std.testing.expectEqual(E.a, smith.valueWeighted(E, baselineWeights(E)));
try std.testing.expectEqual(12345, smith.valueWeighted(u64, &.{.value(u64, 12345, 1)}));
}
test valueRangeAtMost {
var smith: Smith = .{
.in = constructInput(&.{
.{ .int = 100 },
.{ .int = 100 },
.{ .int = 200 },
.{ .int = 100 },
.{ .int = 200 },
.{ .int = 0 },
}),
};
try std.testing.expectEqual(100, smith.valueRangeAtMost(u8, 0, 250));
try std.testing.expectEqual(100, smith.valueRangeAtMost(u8, 100, 100));
try std.testing.expectEqual(0, smith.valueRangeAtMost(u8, 0, 100));
try std.testing.expectEqual(100 - 128, smith.valueRangeAtMost(i8, -100, 100));
try std.testing.expectEqual(200 - 128, smith.valueRangeAtMost(i8, -100, 100));
try std.testing.expectEqual(-100, smith.valueRangeAtMost(i8, -100, 100));
}
test valueRangeLessThan {
var smith: Smith = .{
.in = constructInput(&.{
.{ .int = 100 },
.{ .int = 100 },
.{ .int = 100 },
.{ .int = 100 + 128 },
}),
};
try std.testing.expectEqual(100, smith.valueRangeLessThan(u8, 0, 250));
try std.testing.expectEqual(0, smith.valueRangeLessThan(u8, 0, 100));
try std.testing.expectEqual(100 - 128, smith.valueRangeLessThan(i8, -100, 100));
try std.testing.expectEqual(-100, smith.valueRangeLessThan(i8, -100, 100));
}
test eos {
var smith: Smith = .{
.in = constructInput(&.{
.{ .eos = false },
.{ .eos = true },
}),
};
try std.testing.expect(!smith.eos());
try std.testing.expect(smith.eos());
try std.testing.expect(smith.eos());
}
test eosWeighted {
var smith: Smith = .{ .in = constructInput(&.{.{ .eos = false }}) };
try std.testing.expect(smith.eosWeighted(&.{.value(bool, true, std.math.maxInt(u64))}));
}
test bytes {
var smith: Smith = .{ .in = constructInput(&.{
.{ .bytes = "testing!" },
.{ .bytes = "ab" },
}) };
var buf: [8]u8 = undefined;
smith.bytes(&buf);
try std.testing.expectEqualSlices(u8, "testing!", &buf);
smith.bytes(buf[0..0]);
smith.bytes(buf[0..3]);
try std.testing.expectEqualSlices(u8, "ab\x00", buf[0..3]);
}
test bytesWeighted {
var smith: Smith = .{ .in = constructInput(&.{
.{ .bytes = "testing!" },
.{ .bytes = "ab" },
}) };
const weights: []const Weight = &.{.rangeAtMost(u8, 'a', 'z', 1)};
var buf: [8]u8 = undefined;
smith.bytesWeighted(&buf, weights);
try std.testing.expectEqualSlices(u8, "testinga", &buf);
smith.bytesWeighted(buf[0..0], weights);
smith.bytesWeighted(buf[0..3], weights);
try std.testing.expectEqualSlices(u8, "aba", buf[0..3]);
}
test slice {
var smith: Smith = .{
.in = constructInput(&.{
.{ .slice = "testing!" },
.{ .slice = "" },
.{ .slice = "ab" },
.{ .bytes = std.mem.asBytes(&std.mem.nativeToLittle(u32, 4)) }, // length past end
}),
};
var buf: [8]u8 = undefined;
try std.testing.expectEqualSlices(u8, "testing!", buf[0..smith.slice(&buf)]);
try std.testing.expectEqualSlices(u8, "", buf[0..smith.slice(&buf)]);
try std.testing.expectEqualSlices(u8, "ab", buf[0..smith.slice(&buf)]);
try std.testing.expectEqualSlices(u8, "", buf[0..smith.slice(&buf)]);
}
test sliceWeightedBytes {
const weights: []const Weight = &.{.rangeAtMost(u8, 'a', 'z', 1)};
var smith: Smith = .{ .in = constructInput(&.{
.{ .slice = "testing!" },
}) };
var buf: [8]u8 = undefined;
try std.testing.expectEqualSlices(
u8,
"testinga",
buf[0..smith.sliceWeightedBytes(&buf, weights)],
);
try std.testing.expectEqualSlices(u8, "", buf[0..smith.sliceWeightedBytes(&buf, weights)]);
}
test sliceWeighted {
const len_weights: []const Weight = &.{.rangeAtMost(u8, 3, 6, 1)};
const weights: []const Weight = &.{.rangeAtMost(u8, 'a', 'z', 1)};
var smith: Smith = .{ .in = constructInput(&.{
.{ .slice = "testing!" },
.{ .slice = "ing!" },
.{ .slice = "ab" },
}) };
var buf: [8]u8 = undefined;
try std.testing.expectEqualSlices(
u8,
"tes",
buf[0..smith.sliceWeighted(&buf, len_weights, weights)],
);
try std.testing.expectEqualSlices(
u8,
"inga",
buf[0..smith.sliceWeighted(&buf, len_weights, weights)],
);
try std.testing.expectEqualSlices(
u8,
"aba",
buf[0..smith.sliceWeighted(&buf, len_weights, weights)],
);
try std.testing.expectEqualSlices(
u8,
"aaa",
buf[0..smith.sliceWeighted(&buf, len_weights, weights)],
);
}


@@ -14,6 +14,7 @@ pub const Server = @import("zig/Server.zig");
pub const Client = @import("zig/Client.zig");
pub const Token = tokenizer.Token;
pub const Tokenizer = tokenizer.Tokenizer;
+pub const TokenSmith = @import("zig/TokenSmith.zig");
pub const string_literal = @import("zig/string_literal.zig");
pub const number_literal = @import("zig/number_literal.zig");
pub const primitives = @import("zig/primitives.zig");
@@ -987,6 +988,7 @@ test {
_ = LibCDirs;
_ = LibCInstallation;
_ = Server;
+_ = TokenSmith;
_ = WindowsSdk;
_ = number_literal;
_ = primitives;


@@ -160,10 +160,21 @@ pub fn parse(gpa: Allocator, source: [:0]const u8, mode: Mode) Allocator.Error!A
if (token.tag == .eof) break;
}
+var tokens_slice = tokens.toOwnedSlice();
+errdefer tokens_slice.deinit(gpa);
+return parseTokens(gpa, source, tokens_slice, mode);
+}
+pub fn parseTokens(
+gpa: Allocator,
+source: [:0]const u8,
+tokens: Ast.TokenList.Slice,
+mode: Mode,
+) Allocator.Error!Ast {
var parser: Parse = .{
.source = source,
.gpa = gpa,
-.tokens = tokens.slice(),
+.tokens = tokens,
.errors = .{},
.nodes = .{},
.extra_data = .{},
@@ -194,7 +205,7 @@ pub fn parse(gpa: Allocator, source: [:0]const u8, mode: Mode) Allocator.Error!A
return Ast{
.source = source,
.mode = mode,
-.tokens = tokens.toOwnedSlice(),
+.tokens = tokens,
.nodes = parser.nodes.toOwnedSlice(),
.extra_data = extra_data,
.errors = errors,

lib/std/zig/TokenSmith.zig (new file)

@@ -0,0 +1,277 @@
//! Generates a list of tokens and a valid corresponding source.
//! Smithed intertoken content is a non-goal of this.
const std = @import("../std.zig");
const Smith = std.testing.Smith;
const Token = std.zig.Token;
const TokenList = std.zig.Ast.TokenList;
const TokenSmith = @This();
source_buf: [4096]u8,
source_len: u32,
tag_buf: [512]Token.Tag,
start_buf: [512]std.zig.Ast.ByteOffset,
tags_len: u16,
fn symbolLenWeights(t: *TokenSmith, min: u32, reserve: u32) [2]Smith.Weight {
@disableInstrumentation();
const space = @as(u32, t.source_buf.len - 1) - t.source_len - reserve;
std.debug.assert(space >= 15);
return .{
.rangeAtMost(u32, min, space, 1),
.rangeAtMost(u32, min, 15, space),
};
}
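// Note (added for clarity): the first weight spans every remaining length
// with weight 1, while the second re-weights lengths up to 15 by `space`,
// biasing strongly toward short symbols without excluding long ones.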
pub fn gen(smith: *Smith) TokenSmith {
@disableInstrumentation();
var t: TokenSmith = .{
.source_buf = undefined,
.source_len = 0,
.tag_buf = undefined,
.start_buf = undefined,
.tags_len = 0,
};
const max_lexeme_len = comptime max: {
var max: usize = 0;
for (std.meta.tags(Token.Tag)) |tag| {
max = @max(max, if (tag.lexeme()) |s| s.len else 0);
}
break :max max;
} + 1; // + space
const symbol_reserved = 15 + 4; // 4 = doc comment: "///\n"
const max_output_bytes = @max(symbol_reserved, max_lexeme_len);
while (t.tags_len + 2 < t.tag_buf.len - 1 and
t.source_len + max_output_bytes < t.source_buf.len - 1 and
!smith.eosWeightedSimple(7, 1))
{
const tag = smith.value(Token.Tag);
if (tag == .eof) continue;
t.tag_buf[t.tags_len] = tag;
t.start_buf[t.tags_len] = t.source_len;
t.tags_len += 1;
if (tag.lexeme()) |lexeme| {
@memcpy(t.source_buf[t.source_len..][0..lexeme.len], lexeme);
t.source_len += @intCast(lexeme.len);
if (tag == .invalid_periodasterisks) {
t.tag_buf[t.tags_len] = .asterisk;
t.start_buf[t.tags_len] = t.source_len - 1;
t.tags_len += 1;
}
t.source_buf[t.source_len] = '\n';
t.source_len += 1;
} else sw: switch (tag) {
.invalid => {
// While there are multiple ways `invalid` may be hit,
// it is unlikely the source will be inspected.
t.source_buf[t.source_len] = 0;
t.source_len += 1;
},
.identifier => {
const start = smith.valueWeighted(u8, &.{
.rangeAtMost(u8, 'a', 'z', 1),
.rangeAtMost(u8, '@', 'Z', 1), // @, A...Z
.value(u8, '_', 1),
});
t.source_buf[t.source_len] = start;
t.source_len += 1;
if (start == '@') continue :sw .string_literal;
const len_weights = t.symbolLenWeights(0, 1);
const len = smith.sliceWeighted(
t.source_buf[t.source_len..],
&len_weights,
&.{
.rangeAtMost(u8, 'a', 'z', 1),
.rangeAtMost(u8, 'A', 'Z', 1),
.rangeAtMost(u8, '0', '9', 1),
.value(u8, '_', 1),
},
);
if (Token.getKeyword(t.source_buf[t.source_len - 1 ..][0 .. len + 1]) != null) {
t.source_buf[t.source_len - 1] = '_';
}
t.source_len += len;
t.source_buf[t.source_len] = '\n';
t.source_len += 1;
},
.char_literal, .string_literal => |kind| {
const end: u8 = switch (kind) {
.char_literal => '\'',
.string_literal => '"',
else => unreachable,
};
t.source_buf[t.source_len] = end;
t.source_len += 1;
const len_weights = t.symbolLenWeights(0, 2);
const len = smith.sliceWeighted(
t.source_buf[t.source_len..],
&len_weights,
&.{
.rangeAtMost(u8, 0x20, 0x7e, 1),
.value(u8, '\\', 15),
},
);
var start_escape = false;
for (t.source_buf[t.source_len..][0..len]) |*c| {
if (!start_escape and c.* == end) c.* = ' ';
start_escape = !start_escape and c.* == '\\';
}
if (start_escape) t.source_buf[t.source_len..][len - 1] = ' ';
t.source_len += len;
t.source_buf[t.source_len] = end;
t.source_buf[t.source_len + 1] = '\n';
t.source_len += 2;
},
.multiline_string_literal_line => {
t.source_buf[t.source_len..][0..2].* = @splat('\\');
t.source_len += 2;
const len_weights = t.symbolLenWeights(0, 1);
t.source_len += smith.sliceWeighted(
t.source_buf[t.source_len..],
&len_weights,
&.{.rangeAtMost(u8, 0x20, 0x7e, 1)},
);
t.source_buf[t.source_len] = '\n';
t.source_len += 1;
},
.number_literal => {
t.source_buf[t.source_len] = smith.valueRangeAtMost(u8, '0', '9');
t.source_len += 1;
const len_weights = t.symbolLenWeights(0, 1);
const len = smith.sliceWeighted(
t.source_buf[t.source_len..],
&len_weights,
&.{
.rangeAtMost(u8, '0', '9', 8),
.rangeAtMost(u8, 'a', 'z', 1),
.rangeAtMost(u8, 'A', 'Z', 1),
.value(u8, '+', 1),
.rangeAtMost(u8, '-', '.', 1), // -, .
},
);
var no_period = false;
var not_exponent = true;
for (t.source_buf[t.source_len..][0..len], 0..) |*c, i| {
const invalid_period = no_period and c.* == '.' or i + 1 == len;
const is_exponent = c.* == '-' or c.* == '+';
const invalid_exponent = not_exponent and is_exponent;
const valid_exponent = !not_exponent and is_exponent;
if (invalid_period or invalid_exponent) c.* = '0';
no_period |= c.* == '.' or valid_exponent;
not_exponent = switch (c.*) {
'e', 'E', 'p', 'P' => false,
else => true,
};
}
t.source_len += len;
t.source_buf[t.source_len] = '\n';
t.source_len += 1;
},
.builtin => {
t.source_buf[t.source_len] = '@';
t.source_len += 1;
const len_weights = t.symbolLenWeights(1, 1);
const len = smith.sliceWeighted(
t.source_buf[t.source_len..],
&len_weights,
&.{
.rangeAtMost(u8, 'a', 'z', 1),
.rangeAtMost(u8, 'A', 'Z', 1),
.rangeAtMost(u8, '0', '9', 1),
.value(u8, '_', 1),
},
);
if (t.source_buf[t.source_len] >= '0' and t.source_buf[t.source_len] <= '9') {
t.source_buf[t.source_len] = '_';
}
t.source_len += len;
t.source_buf[t.source_len] = '\n';
t.source_len += 1;
},
.doc_comment, .container_doc_comment => |kind| {
t.source_buf[t.source_len..][0..2].* = "//".*;
t.source_buf[t.source_len..][2] = switch (kind) {
.doc_comment => '/',
.container_doc_comment => '!',
else => unreachable,
};
t.source_len += 3;
const len_weights = t.symbolLenWeights(0, 1);
const len = smith.sliceWeighted(
t.source_buf[t.source_len..],
&len_weights,
&.{
.rangeAtMost(u8, 0x20, 0x7e, 1),
.rangeAtMost(u8, 0x80, 0xff, 1),
},
);
if (kind == .doc_comment and len != 0 and t.source_buf[t.source_len] == '/') {
t.source_buf[t.source_len] = ' ';
}
t.source_len += len;
t.source_buf[t.source_len] = '\n';
t.source_len += 1;
},
else => unreachable,
}
}
t.tag_buf[t.tags_len] = .eof;
t.start_buf[t.tags_len] = t.source_len;
t.tags_len += 1;
t.source_buf[t.source_len] = 0;
return t;
}
pub fn source(t: *TokenSmith) [:0]u8 {
return t.source_buf[0..t.source_len :0];
}
/// The Slice is not backed by a MultiArrayList, so calling deinit or toMultiArrayList is illegal.
pub fn list(t: *TokenSmith) TokenList.Slice {
var slice: TokenList.Slice = .{
.ptrs = undefined,
.len = t.tags_len,
.capacity = t.tags_len,
};
comptime std.debug.assert(slice.ptrs.len == 2);
slice.ptrs[@intFromEnum(TokenList.Field.tag)] = @ptrCast(&t.tag_buf);
slice.ptrs[@intFromEnum(TokenList.Field.start)] = @ptrCast(&t.start_buf);
return slice;
}
test TokenSmith {
try std.testing.fuzz({}, checkSource, .{});
}
fn checkSource(_: void, smith: *Smith) !void {
var t: TokenSmith = .gen(smith);
try std.testing.expectEqual(Token.Tag.eof, t.tag_buf[t.tags_len - 1]);
var tokenizer: std.zig.Tokenizer = .init(t.source());
for (t.tag_buf[0..t.tags_len], t.start_buf[0..t.tags_len]) |tag, start| {
const tok = tokenizer.next();
try std.testing.expectEqual(tok.tag, tag);
try std.testing.expectEqual(tok.loc.start, start);
if (tag == .invalid) break;
}
}


@@ -6466,14 +6466,9 @@ test "fuzz ast parse" {
try std.testing.fuzz({}, fuzzTestOneParse, .{});
}
-fn fuzzTestOneParse(_: void, input: []const u8) !void {
-// The first byte holds if zig / zon
-if (input.len == 0) return;
-const mode: std.zig.Ast.Mode = if (input[0] & 1 == 0) .zig else .zon;
-const bytes = input[1..];
+fn fuzzTestOneParse(_: void, smith: *std.testing.Smith) !void {
+const mode = smith.value(std.zig.Ast.Mode);
+var tokens: std.zig.TokenSmith = .gen(smith);
var fba: std.heap.FixedBufferAllocator = .init(&fixed_buffer_mem);
-const allocator = fba.allocator();
-const source = allocator.dupeZ(u8, bytes) catch return;
-_ = std.zig.Ast.parse(allocator, source, mode) catch return;
+_ = std.zig.Ast.parseTokens(fba.allocator(), tokens.source(), tokens.list(), mode) catch return;
}


@@ -713,6 +713,9 @@ pub const Tokenizer = struct {
self.index += 1;
switch (self.buffer[self.index]) {
0, '\n' => result.tag = .invalid,
+0x01...0x09, 0x0b...0x1f, 0x7f => {
+continue :state .invalid;
+},
else => continue :state .string_literal,
}
},
@@ -1721,15 +1724,22 @@ fn testTokenize(source: [:0]const u8, expected_token_tags: []const Token.Tag) !v
try std.testing.expectEqual(source.len, last_token.loc.end);
}
-fn testPropertiesUpheld(_: void, source: []const u8) !void {
-var source0_buf: [512]u8 = undefined;
-if (source.len + 1 > source0_buf.len)
-return;
-@memcpy(source0_buf[0..source.len], source);
-source0_buf[source.len] = 0;
-const source0 = source0_buf[0..source.len :0];
+fn testPropertiesUpheld(_: void, smith: *std.testing.Smith) !void {
+@disableInstrumentation();
+var source_buf: [512]u8 = undefined;
+const len = smith.sliceWeightedBytes(source_buf[0 .. source_buf.len - 1], &.{
+.rangeAtMost(u8, 0x00, 0xff, 1),
+.rangeAtMost(u8, 0x20, 0x7e, 4),
+.rangeAtMost(u8, 0x00, 0x1f, 1),
+.value(u8, 0, 6),
+.value(u8, ' ', 6),
+.rangeAtMost(u8, '\t', '\n', 6), // \t, \n
+.value(u8, '\r', 3),
+});
+source_buf[len] = 0;
+const source = source_buf[0..len :0];
-var tokenizer = Tokenizer.init(source0);
+var tokenizer = Tokenizer.init(source);
var tokenization_failed = false;
while (true) {
const token = tokenizer.next();
@@ -1742,12 +1752,12 @@ fn testPropertiesUpheld(_: void, source: []const u8) !void {
tokenization_failed = true;
// Property: invalid token always ends at newline or eof
-try std.testing.expect(source0[token.loc.end] == '\n' or source0[token.loc.end] == 0);
+try std.testing.expect(source[token.loc.end] == '\n' or source[token.loc.end] == 0);
},
.eof => {
// Property: EOF token is always 0-length at end of source.
-try std.testing.expectEqual(source0.len, token.loc.start);
-try std.testing.expectEqual(source0.len, token.loc.end);
+try std.testing.expectEqual(source.len, token.loc.start);
+try std.testing.expectEqual(source.len, token.loc.end);
break;
},
else => continue,
@@ -1755,7 +1765,7 @@ fn testPropertiesUpheld(_: void, source: []const u8) !void {
}
if (tokenization_failed) return;
-for (source0) |cur| {
+for (source) |cur| {
// Property: No null byte allowed except at end.
if (cur == 0) {
return error.TestUnexpectedResult;


@@ -1112,7 +1112,7 @@ pub const Object = struct {
// needs to for better fuzzing logic.
.IndirectCalls = false,
.TraceBB = false,
-.TraceCmp = options.fuzz,
+.TraceCmp = false,
.TraceDiv = false,
.TraceGep = false,
.Use8bitCounters = false,


@@ -2,9 +2,7 @@ const std = @import("std");
const abi = std.Build.abi.fuzz;
const native_endian = @import("builtin").cpu.arch.endian();
-fn testOne(in: abi.Slice) callconv(.c) void {
-std.debug.assertReadable(in.toSlice());
-}
+fn testOne() callconv(.c) void {}
pub fn main() !void {
var debug_gpa_ctx: std.heap.DebugAllocator(.{}) = .init;
@@ -24,7 +22,7 @@ pub fn main() !void {
defer cache_dir.close();
abi.fuzzer_init(.fromSlice(cache_dir_path));
-abi.fuzzer_init_test(testOne, .fromSlice("test"));
+abi.fuzzer_set_test(testOne, .fromSlice("test"));
abi.fuzzer_new_input(.fromSlice(""));
abi.fuzzer_new_input(.fromSlice("hello"));