Zig deflate compression/decompression implementation. It supports compression and decompression of gzip, zlib and raw deflate format.
Fixes#18062.
This PR replaces current compress/gzip and compress/zlib packages. Deflate package is renamed to flate. Flate is common name for deflate/inflate where deflate is compression and inflate decompression.
There are breaking change. Methods signatures are changed because of removal of the allocator, and I also unified API for all three namespaces (flate, gzip, zlib).
Currently I put old packages under v1 namespace they are still available as compress/v1/gzip, compress/v1/zlib, compress/v1/deflate. Idea is to give users of the current API little time to postpone analyzing what they had to change. Although that rises question when it is safe to remove that v1 namespace.
Here is current API in the compress package:
```Zig
// deflate
fn compressor(allocator, writer, options) !Compressor(@TypeOf(writer))
fn Compressor(comptime WriterType) type
fn decompressor(allocator, reader, null) !Decompressor(@TypeOf(reader))
fn Decompressor(comptime ReaderType: type) type
// gzip
fn compress(allocator, writer, options) !Compress(@TypeOf(writer))
fn Compress(comptime WriterType: type) type
fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// zlib
fn compressStream(allocator, writer, options) !CompressStream(@TypeOf(writer))
fn CompressStream(comptime WriterType: type) type
fn decompressStream(allocator, reader) !DecompressStream(@TypeOf(reader))
fn DecompressStream(comptime ReaderType: type) type
// xz
fn decompress(allocator: Allocator, reader: anytype) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// lzma
fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// lzma2
fn decompress(allocator, reader, writer !void
// zstandard:
fn DecompressStream(ReaderType, options) type
fn decompressStream(allocator, reader) DecompressStream(@TypeOf(reader), .{})
struct decompress
```
The proposed naming convention:
- Compressor/Decompressor for functions which return type, like Reader/Writer/GeneralPurposeAllocator
- compressor/compressor for functions which are initializers for that type, like reader/writer/allocator
- compress/decompress for one shot operations, accepts reader/writer pair, like read/write/alloc
```Zig
/// Compress from reader and write compressed data to the writer.
fn compress(reader: anytype, writer: anytype, options: Options) !void
/// Create Compressor which outputs the writer.
fn compressor(writer: anytype, options: Options) !Compressor(@TypeOf(writer))
/// Compressor type
fn Compressor(comptime WriterType: type) type
/// Decompress from reader and write plain data to the writer.
fn decompress(reader: anytype, writer: anytype) !void
/// Create Decompressor which reads from reader.
fn decompressor(reader: anytype) Decompressor(@TypeOf(reader)
/// Decompressor type
fn Decompressor(comptime ReaderType: type) type
```
Comparing this implementation with the one we currently have in Zig's standard library (std).
Std is roughly 1.2-1.4 times slower in decompression, and 1.1-1.2 times slower in compression. Compressed sizes are pretty much same in both cases.
More resutls in [this](https://github.com/ianic/flate) repo.
This library uses static allocations for all structures, doesn't require allocator. That makes sense especially for deflate where all structures, internal buffers are allocated to the full size. Little less for inflate where we std version uses less memory by not preallocating to theoretical max size array which are usually not fully used.
For deflate this library allocates 395K while std 779K.
For inflate this library allocates 74.5K while std around 36K.
Inflate difference is because we here use 64K history instead of 32K in std.
If merged existing usage of compress gzip/zlib/deflate need some changes. Here is example with necessary changes in comments:
```Zig
const std = @import("std");
// To get this file:
// wget -nc -O war_and_peace.txt https://www.gutenberg.org/ebooks/2600.txt.utf-8
const data = @embedFile("war_and_peace.txt");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer std.debug.assert(gpa.deinit() == .ok);
const allocator = gpa.allocator();
try oldDeflate(allocator);
try new(std.compress.flate, allocator);
try oldZlib(allocator);
try new(std.compress.zlib, allocator);
try oldGzip(allocator);
try new(std.compress.gzip, allocator);
}
pub fn new(comptime pkg: type, allocator: std.mem.Allocator) !void {
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Compressor
var cmp = try pkg.compressor(buf.writer(), .{});
_ = try cmp.write(data);
try cmp.finish();
var fbs = std.io.fixedBufferStream(buf.items);
// Decompressor
var dcp = pkg.decompressor(fbs.reader());
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldDeflate(allocator: std.mem.Allocator) !void {
const deflate = std.compress.v1.deflate;
// Compressor
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Remove allocator
// Rename deflate -> flate
var cmp = try deflate.compressor(allocator, buf.writer(), .{});
_ = try cmp.write(data);
try cmp.close(); // Rename to finish
cmp.deinit(); // Remove
// Decompressor
var fbs = std.io.fixedBufferStream(buf.items);
// Remove allocator and last param
// Rename deflate -> flate
// Remove try
var dcp = try deflate.decompressor(allocator, fbs.reader(), null);
defer dcp.deinit(); // Remove
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldZlib(allocator: std.mem.Allocator) !void {
const zlib = std.compress.v1.zlib;
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Compressor
// Rename compressStream => compressor
// Remove allocator
var cmp = try zlib.compressStream(allocator, buf.writer(), .{});
_ = try cmp.write(data);
try cmp.finish();
cmp.deinit(); // Remove
var fbs = std.io.fixedBufferStream(buf.items);
// Decompressor
// decompressStream => decompressor
// Remove allocator
// Remove try
var dcp = try zlib.decompressStream(allocator, fbs.reader());
defer dcp.deinit(); // Remove
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldGzip(allocator: std.mem.Allocator) !void {
const gzip = std.compress.v1.gzip;
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Compressor
// Rename compress => compressor
// Remove allocator
var cmp = try gzip.compress(allocator, buf.writer(), .{});
_ = try cmp.write(data);
try cmp.close(); // Rename to finisho
cmp.deinit(); // Remove
var fbs = std.io.fixedBufferStream(buf.items);
// Decompressor
// Rename decompress => decompressor
// Remove allocator
// Remove try
var dcp = try gzip.decompress(allocator, fbs.reader());
defer dcp.deinit(); // Remove
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
```
This also makes a long-overdue change of extracting common state from
Build into a shared Graph object.
Getting the semantics right for these flags turned out to be quite
tricky. In the end it works like this:
* The override only happens when the target is fully native, with no
additional query parameters, such as versions or CPU features added.
* The override affects the resolved Target but leaves the original Query
unmodified.
* The "is native?" detection logic operates on the original, unmodified
query. This makes it possible to provide invalid host target
information, causing confusing errors to occur. Don't do that.
There are some minor breaking changes to std.Build API such as the fact
that `b.zig_exe` is now moved to `b.graph.zig_exe`, as well as a handful
of other similar flags.
This change is seemingly insignificant but I actually agonized over this
for three days. Some other things I considered:
* (status quo in master branch) make Compile step creation functions
accept a Target.Query and delete the ResolvedTarget struct.
- downside: redundantly resolve target queries many times
* same as before but additionally add a hash map to cache target query
resolutions.
- downside: now there is a hash map that doesn't actually need to
exist, just to make the API more ergonomic.
* add is_native_os and is_native_abi fields to std.Target and use it
directly as the result of resolving a target query.
- downside: they really don't belong there. They would be available
as comptime booleans via `@import("builtin")` but they should not
be exposed that way.
With this change the downsides are:
* the option name of addExecutable and friends is `target` instead of
`resolved_target` matching the type name.
- upside: this does not break compatibility with existing build
scripts
* you likely end up seeing `target.result.cpu.arch` rather than
`target.cpu.arch`.
- upside: this is an improvement over `target.target.cpu.arch` which
it was before this commit.
- downside: `b.host.target` is now `b.host.result`.
Introduce the concept of "target query" and "resolved target". A target
query is what the user specifies, with some things left to default. A
resolved target has the default things discovered and populated.
In the future, std.zig.CrossTarget will be rename to std.Target.Query.
Introduces `std.Build.resolveTargetQuery` to get from one to the other.
The concept of `main_mod_path` is gone, no longer supported. You have to
put the root source file at the module root now.
* remove deprecated API
* update build.zig for the breaking API changes in this branch
* move std.Build.Step.Compile.BuildId to std.zig.BuildId
* add more options to std.Build.ExecutableOptions, std.Build.ObjectOptions,
std.Build.SharedLibraryOptions, std.Build.StaticLibraryOptions, and
std.Build.TestOptions.
* remove `std.Build.constructCMacro`. There is no use for this API.
* deprecate `std.Build.Step.Compile.defineCMacro`. Instead,
`std.Build.Module.addCMacro` is provided.
- remove `std.Build.Step.Compile.defineCMacroRaw`.
* deprecate `std.Build.Step.Compile.linkFrameworkNeeded`
- use `std.Build.Module.linkFramework`
* deprecate `std.Build.Step.Compile.linkFrameworkWeak`
- use `std.Build.Module.linkFramework`
* move more logic into `std.Build.Module`
* allow `target` and `optimize` to be `null` when creating a Module.
Along with other fields, those unspecified options will be inherited
from parent `Module` when inserted into an import table.
* the `target` field of `addExecutable` is now required. pass `b.host`
to get the host target.
For computing the zig version number, pass --abbrev=9 rather than
requiring the user to set their git configuration in order to make zig
versions match the standard.
Justification: exec, execv etc are unix concepts and portable version
should be called differently.
Do no touch non-Zig code. Adjust error names as well, if associated.
Closes#5853.
Previous commit caused this error to be printed:
zig build-exe zig Debug aarch64-macos-none: error: memory usage peaked
at 6323765248 bytes, exceeding the declared upper bound of 5200000000
I observed some out-of-memory conditions happening on one of the CI
servers. This should make it avoid scheduling compiler builds at the
same time as other memory-intensive operations.
The build was previously failing with `error: unknown command: -print-file-name=libstdc++.a`
because the command invocation was
`zig -print-file-name=libstdc++.a`
instead of
`zig c++ -print-file-name=libstdc++.a`
note: .cxx_compiler_arg1 = "" instead of undefined to avoid breaking existing setups without requiring to run cmake again.
* start renaming "package" to "module" (see #14307)
- build system gains `main_mod_path` and `main_pkg_path` is still
there but it is deprecated.
* eliminate the object-oriented memory management style of what was
previously `*Package`. Now it is `*Package.Module` and all pointers
point to externally managed memory.
* fixes to get the new Fetch.zig code working. The previous commit was
work-in-progress. There are still two commented out code paths, the
one that leads to `Compilation.create` and the one for `zig build`
that fetches the entire dependency tree and creates the required
modules for the build runner.
- Adds `illumos` to the `Target.Os.Tag` enum. A new function,
`isSolarish` has been added that returns true if the tag is either
Solaris or Illumos. This matches the naming convention found in Rust's
`libc` crate[1].
- Add the tag wherever `.solaris` is being checked against.
- Check for the C pre-processor macro `__illumos__` in CMake to set the
proper target tuple. Illumos distros patch their compilers to have
this in the "built-in" set (verified with `echo | cc -dM -E -`).
Alternatively you could check the output of `uname -o`.
Right now, both Solaris and Illumos import from `c/solaris.zig`. In the
future it may be worth putting the shared ABI bits in a base file, and
mixing that in with specific `c/solaris.zig`/`c/illumos.zig` files.
[1]: 6e02a329a2/src/unix/solarish
This option exists for fast iteration during compiler development, since
avoiding codegen (but still performing semantic analysis) causes compile
errors to appear around twice as fast. This option was broken when the
`emit_bin` property on `std.Build.Step.Compile` was removed.
Now, when this option is passed, we add a direct install dependency on
the compile step itself, so that it is always run.
This reverts commit 4d1432299f.
Please don't hard-code unrelated concerns this way. build.zig should not
have awareness of the naming conventions for cmake build directories.
We already have a zig build step for this: test-fmt.
* Rename the test-fmt step to check-fmt.
test-fmt sounds like it runs tests for `zig fmt` itself (lib/std/zig/render.zig).
* Use it instead of `zig fmt --check` in the CI scripts.
* Also use it in CI scripts that didn't have this check before.
Detect system libcxx name with the CMake build system and convey that
information to build.zig via config.h so that the same system libcxx is
used for stage3.
This undoes the hacky search 901457d173 .
closes#17018
* introduce LazyPath.cwd_relative variant and use it for --zig-lib-dir. closes#12685
* move overrideZigLibDir and setMainPkgPath to options fields set once
and then never mutated.
* avoid introducing Build/util.zig
* use doc comments for deprecation notices so that they show up in
generated documentation.
* introduce InstallArtifact.Options, accept it as a parameter to
addInstallArtifact, and move override_dest_dir into it. Instead of
configuring the installation via Compile step, configure the
installation via the InstallArtifact step. In retrospect this is
obvious.
* remove calls to pushInstalledFile in InstallArtifact. See #14943
* rewrite InstallArtifact to not incorrectly observe whether a Compile
step has any generated outputs. InstallArtifact is meant to trigger
output generation.
* fix child process evaluation code handling of `-fno-emit-bin`.
* don't store out_h_filename, out_ll_filename, etc., pointlessly. these
are all just simple extensions appended to the root name.
* make emit_directory optional. It's possible to have nothing outputted,
for example, if you're just type-checking.
* avoid passing -femit-foo/-fno-emit-foo when it is the default
* rename ConfigHeader.getTemplate to getOutput
* deprecate addOptionArtifact
* update the random number seed of Options step caching.
* avoid using `inline for` pointlessly
* avoid using `override_Dest_dir` pointlessly
* avoid emitting an executable pointlessly in test cases
Removes forceBuild and forceEmit. Let's consider these additions separately.
Nearly all of the usage sites were suspicious.