Commit graph

21 commits

Author SHA1 Message Date
Andrew Kelley
9f27d770a1 std.io: deprecated Reader/Writer; introduce new API 2025-07-07 22:43:51 -07:00
Scott Redig
ff551374a0 fix typo of anytype to type
This seems like a simple typo.  The values are immediately used in a struct as types, so there's no reason to use anytype here, afaik.
2025-02-05 13:58:15 +01:00
Andrew Kelley
aa852f737b improve documentation in std
A lot of these "shorthand" doc comments were redundant, low quality
filler content. Better to let the actual modules speak for themselves
with top level doc comments rather than trying to document their
aliases.
2024-03-10 18:13:30 -07:00
Igor Anić
fd9db4962c reorganize compress package root folder 2024-02-14 23:34:13 +01:00
Igor Anić
2457b68b2f remove v1 deflate implementation 2024-02-14 22:34:13 +01:00
Igor Anić
d645114f7e add deflate implemented from first principles
Zig deflate compression/decompression implementation. It supports compression and decompression of gzip, zlib and raw deflate format.

Fixes #18062.

This PR replaces current compress/gzip and compress/zlib packages. Deflate package is renamed to flate. Flate is common name for deflate/inflate where deflate is compression and inflate decompression.

There are breaking change. Methods signatures are changed because of removal of the allocator, and I also unified API for all three namespaces (flate, gzip, zlib).

Currently I put old packages under v1 namespace they are still available as compress/v1/gzip, compress/v1/zlib, compress/v1/deflate. Idea is to give users of the current API little time to postpone analyzing what they had to change. Although that rises question when it is safe to remove that v1 namespace.

Here is current API in the compress package:

```Zig
// deflate
    fn compressor(allocator, writer, options) !Compressor(@TypeOf(writer))
    fn Compressor(comptime WriterType) type

    fn decompressor(allocator, reader, null) !Decompressor(@TypeOf(reader))
    fn Decompressor(comptime ReaderType: type) type

// gzip
    fn compress(allocator, writer, options) !Compress(@TypeOf(writer))
    fn Compress(comptime WriterType: type) type

    fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// zlib
    fn compressStream(allocator, writer, options) !CompressStream(@TypeOf(writer))
    fn CompressStream(comptime WriterType: type) type

    fn decompressStream(allocator, reader) !DecompressStream(@TypeOf(reader))
    fn DecompressStream(comptime ReaderType: type) type

// xz
   fn decompress(allocator: Allocator, reader: anytype) !Decompress(@TypeOf(reader))
   fn Decompress(comptime ReaderType: type) type

// lzma
    fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
    fn Decompress(comptime ReaderType: type) type

// lzma2
    fn decompress(allocator, reader, writer !void

// zstandard:
    fn DecompressStream(ReaderType, options) type
    fn decompressStream(allocator, reader) DecompressStream(@TypeOf(reader), .{})
    struct decompress
```

The proposed naming convention:
 - Compressor/Decompressor for functions which return type, like Reader/Writer/GeneralPurposeAllocator
 - compressor/compressor for functions which are initializers for that type, like reader/writer/allocator
 - compress/decompress for one shot operations, accepts reader/writer pair, like read/write/alloc

```Zig
/// Compress from reader and write compressed data to the writer.
fn compress(reader: anytype, writer: anytype, options: Options) !void

/// Create Compressor which outputs the writer.
fn compressor(writer: anytype, options: Options) !Compressor(@TypeOf(writer))

/// Compressor type
fn Compressor(comptime WriterType: type) type

/// Decompress from reader and write plain data to the writer.
fn decompress(reader: anytype, writer: anytype) !void

/// Create Decompressor which reads from reader.
fn decompressor(reader: anytype) Decompressor(@TypeOf(reader)

/// Decompressor type
fn Decompressor(comptime ReaderType: type) type

```

Comparing this implementation with the one we currently have in Zig's standard library (std).
Std is roughly 1.2-1.4 times slower in decompression, and 1.1-1.2 times slower in compression. Compressed sizes are pretty much same in both cases.
More resutls in [this](https://github.com/ianic/flate) repo.

This library uses static allocations for all structures, doesn't require allocator. That makes sense especially for deflate where all structures, internal buffers are allocated to the full size. Little less for inflate where we std version uses less memory by not preallocating to theoretical max size array which are usually not fully used.

For deflate this library allocates 395K while std 779K.
For inflate this library allocates 74.5K while std around 36K.

Inflate difference is because we here use 64K history instead of 32K in std.

If merged existing usage of compress gzip/zlib/deflate need some changes. Here is example with necessary changes in comments:

```Zig

const std = @import("std");

// To get this file:
// wget -nc -O war_and_peace.txt https://www.gutenberg.org/ebooks/2600.txt.utf-8
const data = @embedFile("war_and_peace.txt");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    try oldDeflate(allocator);
    try new(std.compress.flate, allocator);

    try oldZlib(allocator);
    try new(std.compress.zlib, allocator);

    try oldGzip(allocator);
    try new(std.compress.gzip, allocator);
}

pub fn new(comptime pkg: type, allocator: std.mem.Allocator) !void {
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    var cmp = try pkg.compressor(buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    var dcp = pkg.decompressor(fbs.reader());

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldDeflate(allocator: std.mem.Allocator) !void {
    const deflate = std.compress.v1.deflate;

    // Compressor
    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();
    // Remove allocator
    // Rename deflate -> flate
    var cmp = try deflate.compressor(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finish
    cmp.deinit(); // Remove

    // Decompressor
    var fbs = std.io.fixedBufferStream(buf.items);
    // Remove allocator and last param
    // Rename deflate -> flate
    // Remove try
    var dcp = try deflate.decompressor(allocator, fbs.reader(), null);
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldZlib(allocator: std.mem.Allocator) !void {
    const zlib = std.compress.v1.zlib;

    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    // Rename compressStream => compressor
    // Remove allocator
    var cmp = try zlib.compressStream(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.finish();
    cmp.deinit(); // Remove

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // decompressStream => decompressor
    // Remove allocator
    // Remove try
    var dcp = try zlib.decompressStream(allocator, fbs.reader());
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

pub fn oldGzip(allocator: std.mem.Allocator) !void {
    const gzip = std.compress.v1.gzip;

    var buf = std.ArrayList(u8).init(allocator);
    defer buf.deinit();

    // Compressor
    // Rename compress => compressor
    // Remove allocator
    var cmp = try gzip.compress(allocator, buf.writer(), .{});
    _ = try cmp.write(data);
    try cmp.close(); // Rename to finisho
    cmp.deinit(); // Remove

    var fbs = std.io.fixedBufferStream(buf.items);
    // Decompressor
    // Rename decompress => decompressor
    // Remove allocator
    // Remove try
    var dcp = try gzip.decompress(allocator, fbs.reader());
    defer dcp.deinit(); // Remove

    const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    defer allocator.free(plain);
    try std.testing.expectEqualSlices(u8, data, plain);
}

```
2024-02-14 18:28:20 +01:00
Jacob Young
4dfca01de4 gzip: implement compression 2024-01-29 14:30:23 -08:00
dweiller
765a6d3413 std.compress.zstd: renamed from std.compress.zstandard 2023-02-22 00:11:20 +11:00
dweiller
61cb514387 std.compress: add zstandard decompressor 2023-02-20 09:09:05 +11:00
fn ⌃ ⌥
e03d6c42ea Delete redundant lzma/lzma2 prefix in function/struct names 2023-02-05 06:52:28 -08:00
fn ⌃ ⌥
8e2af21cd9 Add LZMA decoder 2023-02-02 11:59:56 -08:00
Andrew Kelley
ea9ded8758 std.compress.xz public API cleanup
* add xz to std.compress
 * prefer importing std.zig by file name, to reduce reliance on the
   standard library being a special case.
 * extract some types from inside generic functions. These types are the
   same regardless of the generic parameters.
 * expose some more types in the std.compress.xz namespace.
 * rename xz.stream to xz.decompress
 * rename check.Kind to Check
 * use std.leb for LEB instead of a redundant implementation
2023-01-24 15:24:19 -07:00
fn ⌃ ⌥
f85c01d4c7 Implement gzip header CRC check.
From RFC 1952:

> If FHCRC is set, a CRC16 for the gzip header is present,
> immediately before the compressed data. The CRC16 consists
> of the two least significant bytes of the CRC32 for all
> bytes of the gzip header up to and not including the CRC16.
2023-01-22 00:33:29 -05:00
Veikka Tuominen
488e1e5f51 stage2: small fixes + adjustments to std tests 2022-06-11 23:49:33 +03:00
Veikka Tuominen
1a7b4ddeae std: disable tests that crash stage2 2022-06-03 20:21:20 +03:00
Hadrien Dorio
490f067de8
compress: add a deflate compressor
Replaces the inflate API from `inflateStream(reader: anytype, window_slice: []u8)` to
`decompressor(allocator: mem.Allocator, reader: anytype, dictionary: ?[]const u8)` and
`compressor(allocator: mem.Allocator, writer: anytype, options: CompressorOptions)`
2022-01-23 19:30:06 +01:00
Andrew Kelley
d29871977f remove redundant license headers from zig standard library
We already have a LICENSE file that covers the Zig Standard Library. We
no longer need to remind everyone that the license is MIT in every single
file.

Previously this was introduced to clarify the situation for a fork of
Zig that made Zig's LICENSE file harder to find, and replaced it with
their own license that required annual payments to their company.
However that fork now appears to be dead. So there is no need to
reinforce the copyright notice in every single file.
2021-08-24 12:25:09 -07:00
LemonBoy
134f5fd3d6 std: Update test "" to test where it makes sense 2021-01-22 15:46:58 +01:00
Frank Denis
6c2e0c2046 Year++ 2020-12-31 15:45:24 -08:00
LemonBoy
9fe4c89230 std: Add a gzip decoder 2020-09-11 20:02:41 -04:00
LemonBoy
2328f40b7a std: Add DEFLATE and zlib decompressors 2020-09-07 14:59:46 -07:00