Zig deflate compression/decompression implementation. It supports compression and decompression of the gzip, zlib and raw deflate formats.
Fixes #18062.
This PR replaces the current compress/gzip and compress/zlib packages. The deflate package is renamed to flate. Flate is the common name for deflate/inflate, where deflate is compression and inflate is decompression.
There are breaking changes. Method signatures have changed because the allocator was removed, and I also unified the API across all three namespaces (flate, gzip, zlib).
For now I have put the old packages under a v1 namespace; they are still available as compress/v1/gzip, compress/v1/zlib and compress/v1/deflate. The idea is to give users of the current API a little time before they have to analyze what they need to change. That does raise the question of when it is safe to remove the v1 namespace.
Here is the current API in the compress package:
```Zig
// deflate
fn compressor(allocator, writer, options) !Compressor(@TypeOf(writer))
fn Compressor(comptime WriterType) type
fn decompressor(allocator, reader, null) !Decompressor(@TypeOf(reader))
fn Decompressor(comptime ReaderType: type) type
// gzip
fn compress(allocator, writer, options) !Compress(@TypeOf(writer))
fn Compress(comptime WriterType: type) type
fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// zlib
fn compressStream(allocator, writer, options) !CompressStream(@TypeOf(writer))
fn CompressStream(comptime WriterType: type) type
fn decompressStream(allocator, reader) !DecompressStream(@TypeOf(reader))
fn DecompressStream(comptime ReaderType: type) type
// xz
fn decompress(allocator: Allocator, reader: anytype) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// lzma
fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// lzma2
fn decompress(allocator, reader, writer) !void
// zstandard:
fn DecompressStream(ReaderType, options) type
fn decompressStream(allocator, reader) DecompressStream(@TypeOf(reader), .{})
struct decompress
```
The proposed naming convention:
- Compressor/Decompressor for functions which return a type, like Reader/Writer/GeneralPurposeAllocator
- compressor/decompressor for functions which are initializers for that type, like reader/writer/allocator
- compress/decompress for one-shot operations which accept a reader/writer pair, like read/write/alloc
```Zig
/// Compress from reader and write compressed data to the writer.
fn compress(reader: anytype, writer: anytype, options: Options) !void
/// Create a Compressor which writes compressed data to the writer.
fn compressor(writer: anytype, options: Options) !Compressor(@TypeOf(writer))
/// Compressor type
fn Compressor(comptime WriterType: type) type
/// Decompress from reader and write plain data to the writer.
fn decompress(reader: anytype, writer: anytype) !void
/// Create a Decompressor which reads from the reader.
fn decompressor(reader: anytype) Decompressor(@TypeOf(reader))
/// Decompressor type
fn Decompressor(comptime ReaderType: type) type
```
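For illustration, here is a minimal one-shot usage sketch of the proposed API (assuming the `std.compress.flate` namespace and that all options have defaults); the streaming Compressor/Decompressor API is shown in the migration example further below:
```Zig
const std = @import("std");
const flate = std.compress.flate;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // One-shot compress: read plain data from the reader, write deflate data to the writer.
    var compressed = std.ArrayList(u8).init(allocator);
    defer compressed.deinit();
    var plain = std.io.fixedBufferStream("a reasonably long plain text input");
    try flate.compress(plain.reader(), compressed.writer(), .{});

    // One-shot decompress: read deflate data from the reader, write plain data to the writer.
    var decompressed = std.ArrayList(u8).init(allocator);
    defer decompressed.deinit();
    var in = std.io.fixedBufferStream(compressed.items);
    try flate.decompress(in.reader(), decompressed.writer());

    std.debug.assert(std.mem.eql(u8, "a reasonably long plain text input", decompressed.items));
}
```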
Comparing this implementation with the one currently in Zig's standard library (std):
std is roughly 1.2-1.4 times slower in decompression and 1.1-1.2 times slower in compression. Compressed sizes are pretty much the same in both cases.
More results are in [this](https://github.com/ianic/flate) repo.
This library uses static allocations for all structures and doesn't require an allocator. That makes the most sense for deflate, where all structures and internal buffers are allocated to their full size anyway. It helps a little less for inflate, where the std version uses less memory by not preallocating arrays to their theoretical maximum size, since they are usually not fully used.
For deflate, this library allocates 395K while std allocates 779K.
For inflate, this library allocates 74.5K while std allocates around 36K.
The inflate difference is because this implementation uses a 64K history window instead of std's 32K.
If merged, existing usages of compress gzip/zlib/deflate will need some changes. Here is an example with the necessary changes noted in comments:
```Zig
const std = @import("std");
// To get this file:
// wget -nc -O war_and_peace.txt https://www.gutenberg.org/ebooks/2600.txt.utf-8
const data = @embedFile("war_and_peace.txt");
pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();
    try oldDeflate(allocator);
    try new(std.compress.flate, allocator);
    try oldZlib(allocator);
    try new(std.compress.zlib, allocator);
    try oldGzip(allocator);
    try new(std.compress.gzip, allocator);
}
pub fn new(comptime pkg: type, allocator: std.mem.Allocator) !void {
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Compressor
    var cmp = try pkg.compressor(buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();
    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    var dcp = pkg.decompressor(fbs.reader());
    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldDeflate(allocator: std.mem.Allocator) !void {
    const deflate = std.compress.v1.deflate;
    // Compressor
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Remove allocator
    // Rename deflate -> flate
    var cmp = try deflate.compressor(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finish
    cmp.deinit(); // Remove
    // Decompressor
    var fbs = std.io.fixedBufferStream(buf.items);
    // Remove allocator and last param
    // Rename deflate -> flate
    // Remove try
    var dcp = try deflate.decompressor(allocator, fbs.reader(), null);
    defer dcp.deinit(); // Remove
    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldZlib(allocator: std.mem.Allocator) !void {
    const zlib = std.compress.v1.zlib;
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Compressor
    // Rename compressStream => compressor
    // Remove allocator
    var cmp = try zlib.compressStream(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();
    cmp.deinit(); // Remove
    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // Rename decompressStream => decompressor
    // Remove allocator
    // Remove try
    var dcp = try zlib.decompressStream(allocator, fbs.reader());
    defer dcp.deinit(); // Remove
    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldGzip(allocator: std.mem.Allocator) !void {
    const gzip = std.compress.v1.gzip;
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Compressor
    // Rename compress => compressor
    // Remove allocator
    var cmp = try gzip.compress(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finish
    cmp.deinit(); // Remove
    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // Rename decompress => decompressor
    // Remove allocator
    // Remove try
    var dcp = try gzip.decompress(allocator, fbs.reader());
    defer dcp.deinit(); // Remove
    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}
```
The function returns the vector length, not the byte size of the vector or the bit size of individual elements. This distinction is very important and some usages of this function in the stdlib operated under these incorrect assumptions.
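To make the distinction concrete, here is a minimal illustration using only language builtins (not the function the commit refers to):
```Zig
const std = @import("std");

test "vector length vs byte size vs element bit size" {
    const V = @Vector(4, u32);
    // The vector length is the element count: 4. It is not the byte size of
    // the vector, and not the bit size of an individual element.
    try std.testing.expectEqual(16, @sizeOf(V)); // byte size of the whole vector
    try std.testing.expectEqual(32, @bitSizeOf(u32)); // bit size of one element
    try std.testing.expectEqual(128, @bitSizeOf(V)); // bit size of the whole vector
}
```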
This adds scheme guessing when loading proxies, so that
`HTTP_PROXY=127.0.0.1` and the like are now valid and behave identically
to `HTTP_PROXY=http://127.0.0.1`. Additionally, this fixes a typo that was
causing loadDefaultProxies to never populate the https_proxy.
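A rough sketch of the idea, with a hypothetical helper for illustration (this is not the actual std.http code): if the proxy value carries no scheme, assume the scheme implied by the environment variable it came from.
```Zig
const std = @import("std");

fn withGuessedScheme(
    allocator: std.mem.Allocator,
    raw: []const u8,
    default_scheme: []const u8, // "http" for HTTP_PROXY, "https" for HTTPS_PROXY
) ![]u8 {
    // If a scheme is already present, keep the value as-is.
    if (std.mem.indexOf(u8, raw, "://") != null) return allocator.dupe(u8, raw);
    // Otherwise prepend the guessed scheme.
    return std.fmt.allocPrint(allocator, "{s}://{s}", .{ default_scheme, raw });
}

test "127.0.0.1 behaves like http://127.0.0.1" {
    const guessed = try withGuessedScheme(std.testing.allocator, "127.0.0.1", "http");
    defer std.testing.allocator.free(guessed);
    try std.testing.expectEqualStrings("http://127.0.0.1", guessed);
}
```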
This reverts commit 0c99ba1eab, reversing
changes made to 5f92b070bf.
This caused a CI failure when it landed in the master branch due to a
128-bit `@byteSwap` in std.mem.
std_options.http_connection_pool_size has been removed in favor of:
```Zig
client.connection_pool.resize(client.allocator, size);
```
std_options.http_disable_tls will remove all https capability from
std.http when true. Any https request will error with
`error.TlsInitializationFailed`.
Solves #17051.
adds connectTunnel to form an HTTP CONNECT tunnel to the desired host.
Primarily implemented for proxies, but like connectUnix it may be called by
any user.
adds loadDefaultProxies to load proxy information from common
environment variables (http_proxy, HTTP_PROXY, https_proxy, HTTPS_PROXY,
all_proxy, ALL_PROXY).
- no_proxy and NO_PROXY are currently unsupported.
splits proxy into http_proxy and https_proxy, and adds a headers field to
each proxy for arbitrary headers.
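A hedged usage sketch of the proxy loading described above (the exact signature and error set of loadDefaultProxies are assumed here):
```Zig
const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    var client = std.http.Client{ .allocator = gpa.allocator() };
    defer client.deinit();

    // Reads http_proxy / HTTP_PROXY, https_proxy / HTTPS_PROXY and
    // all_proxy / ALL_PROXY from the environment, as listed above.
    try client.loadDefaultProxies();
}
```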
* Add missing period in Stack's description
This looks fine in the source, but looks bad when seen on the documentation website.
* Correct documentation for attachSegfaultHandler()
The description for attachSegfaultHandler() looks pretty bad without indicating that the stuff at the end is code.
* Added missing 'the's in Queue.put's documentation
* Fixed several errors in Stack's documentation
`push()` and `pop()` were not styled as code.
There was no period after `pop()`, which looks bad in the documentation.
* Fix multiple problems in base64.zig
Both "invalid"s in Base64.decoder were not capitalized.
Missing period in documentation of Base64DecoderWithIgnore.calcSizeUpperBound.
* Fix capitalization typos in bit_set.zig
In DynamicBitSetUnmanaged.deinit's and DynamicBitSet.deinit's documentation, "deinitializes" was uncapitalized.
* Fix typos in fifo.zig's documentation
Added a previously missing period to the end of the first line of LinearFifo.writableSlice's documentation.
Added missing periods to both lines of LinearFifo.pump's documentation.
* Fix typos in fmt.bufPrint's documentation
The starts of both lines were not capitalized.
* Fix minor documentation problems in fs/file.zig
Missing periods in documentation for Permissions.setReadOnly, PermissionsWindows.setReadOnly, MetadataUnix.created, MetadataLinux.created, and MetadataWindows.created.
* Fix a glaring typo in enums.zig
* Correct errors in fs.zig
* Fixed documentation problems in hash_map.zig
The added empty line in verify_context's documentation is needed, otherwise autodoc for some reason assumes that the list hasn't been terminated and continues reading off the rest of the documentation as if it were part of the second list item.
* Added lines between consecutive URLs in http.zig
Makes the documentation conform closer to what was intended.
* Fix wrongfully ended sentence in Uri.zig
* Handle wrongly entered comma in valgrind.zig.
* Add missing periods in wasm.zig's documentation
* Fix odd spacing in event/loop.zig
* Add missing period in http/Headers.zig
* Added missing period in io/limited_reader.zig
This isn't in the documentation due to what I guess is a limitation of autodoc, but it's clearly supposed to be. If it were, it would look pretty bad.
* Correct documentation in math/big/int.zig
* Correct formatting in math/big/rational.zig
* Create an actual link to ZIGNOR's paper.
* Fixed grammatical issues in sort/block.zig
This will not show up in the documentation currently.
* Fix typo in hash_map.zig
Addresses #17015 by introducing a new startWithOptions. The only option currently is a flag
to use the provided URI as is, without modification, when passed to the server. Normally this
is neither needed nor desired; however, some REST APIs may have requirements that cannot be
satisfied with the default handling.
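A hedged sketch of how this could look from the caller's side; the surrounding request setup is omitted, and the field name `raw_uri` is an assumption for illustration (the changelog only says the option is an as-is-URI flag):
```Zig
// `req` is assumed to be an already-created request from the http client.
fn startRaw(req: anytype) !void {
    // Ask the client not to normalize or re-encode the URI before sending it.
    try req.startWithOptions(.{ .raw_uri = true });
}
```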
Some servers will respond with the identity encoding, meaning no
encoding, especially when responding to range-get requests. Adding the
identity encoding stops the header parser from failing when it
encounters this.
`TailQueue` was implemented as a doubly-linked list, but named after an
abstract data type. This was inconsistent with `SinglyLinkedList`, which
can be used to implement an abstract data type, but is still named after
the implementation. Renaming `TailQueue` to `DoublyLinkedList` improves
consistency between the two type names, and should help discoverability.
`TailQueue` is now a deprecated alias of `DoublyLinkedList`.
Related to issues #1629 and #8233.
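A small sketch of the rename in practice (using the generic form as it exists at the time of this change); existing code keeps compiling through the deprecated alias:
```Zig
const std = @import("std");

test "DoublyLinkedList replaces TailQueue" {
    // Previously: const List = std.TailQueue(u32);
    const List = std.DoublyLinkedList(u32);
    var list = List{};
    var node = List.Node{ .data = 42 };
    list.append(&node);
    try std.testing.expectEqual(@as(usize, 1), list.len);
    try std.testing.expectEqual(@as(u32, 42), list.first.?.data);
}
```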
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:
* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
The 'Content-Length' header was inspected by mistake,
which made it effectively impossible to use chunked
Transfer-Encoding when using the http client.
Tested locally with an HTTP server: after the fix, data is properly sent
with the POST method and the proper encoding declared.