Commit graph

70 commits

Author SHA1 Message Date
Kendall Condon
8284da2f3d flate.Compress: simplify huffman node comparisons
Instead of comparing each field, nodes are now compared as 32-bit
values where `freq` is in the most significant bits.
2025-11-22 22:11:33 -08:00
Kendall Condon
f50c647977 add deflate compression, simplify decompression
Implements deflate compression from scratch. A history window is kept in
the writer's buffer for matching and a chained hash table is used to
find matches. Tokens are accumulated until a threshold is reached and
then outputted as a block. Flush is used to indicate end of stream.

Additionally, two other deflate writers are provided:
* `Raw` writes only in store blocks (the uncompressed bytes). It
  utilizes data vectors to efficiently send block headers and data.
* `Huffman` only performs Huffman compression on data and no matching.

The above are also able to take advantage of writer semantics since they
do not need to keep a history.

Literal and distance code parameters in `token` have also been reworked.
Their parameters are now derived mathematically, however the more
expensive ones are still obtained through a lookup table (expect on
ReleaseSmall).

Decompression bit reading has been greatly simplified, taking advantage
of the ability to peek on the underlying reader. Additionally, a few
bugs with limit handling have been fixed.
2025-09-30 18:28:47 -07:00
Andrew Kelley
79f267f6b9 std.Io: delete GenericReader
and delete deprecated alias std.io
2025-08-29 17:14:26 -07:00
Andrew Kelley
30b41dc510 std.compress.zstd.Decompress fixes
* std.Io.Reader: appendRemaining no longer supports alignment and has
  different rules about how exceeding limit. Fixed bug where it would
  return success instead of error.StreamTooLong like it was supposed to.

* std.Io.Reader: simplify appendRemaining and appendRemainingUnlimited
  to be implemented based on std.Io.Writer.Allocating

* std.Io.Writer: introduce unreachableRebase

* std.Io.Writer: remove minimum_unused_capacity from Allocating. maybe
  that flexibility could have been handy, but let's see if anyone
  actually needs it. The field is redundant with the superlinear growth
  of ArrayList capacity.

* std.Io.Writer: growingRebase also ensures total capacity on the
  preserve parameter, making it no longer necessary to do
  ensureTotalCapacity at the usage site of decompression streams.

* std.compress.flate.Decompress: fix rebase not taking into account seek

* std.compress.zstd.Decompress: split into "direct" and "indirect" usage
  patterns depending on whether a buffer is provided to init, matching
  how flate works. Remove some overzealous asserts that prevented buffer
  expansion from within rebase implementation.

* std.zig: fix readSourceFileToAlloc returning an overaligned slice
  which was difficult to free correctly.

fixes #24608
2025-08-15 10:44:35 -07:00
Andrew Kelley
af7e142485 std.Io.Writer: introduce rebase to the vtable
fixes #24814
2025-08-14 12:56:37 -07:00
Isaac Freund
b8124d9c0b std.io.Writer.Allocating: rename getWritten() to written()
This "get" is useless noise and was copied from FixedBufferWriter.
Since this API has not yet landed in a release, now is a good time
to make the breaking change to fix this.
2025-08-13 01:43:52 -07:00
Andrew Kelley
df46ee61c4 std.Io.Writer.Allocating: configurable bump amount 2025-08-08 19:22:08 -07:00
Andrew Kelley
91a81d3846 std.compress.flate.Decompress: fix buffer size in test 2025-08-08 15:03:05 -07:00
Ryan Liptak
23fff3442d flate: Handle invalid block type
Fixes `panic: invalid enum value` when the type bits had the u2 value of 3.

Contributes towards #24741
2025-08-08 12:27:25 -07:00
Igor Anić
6de2310035
flate change bit reader Bits to usize (#24719)
Don't see why byte returned from specialPeek needs to be shifted by
remaining_needed_bits.
I believe that decision in specialPeek should be done on the number of
the remaining bits not of the content of that bits.

Some test result are changed, but they are now consistent with the
original state as found in:
5f790464b0/lib/std/compress/flate/Decompress.zig

Changing Bits from usize to u32 or u64 now returns same results.

* flate: simplify peekBitsEnding

`peekBits` returns at most asked number of bits. Fails with EndOfStream
when there are no available bits. If there are less bits available than
asked still returns that available bits.
Hopefully this change better reflects intention. On first input stream
peek error we break the loop.
2025-08-07 14:40:08 -07:00
Igor Anić
d2149106a6 flate zlib fix end of block reading
`n` is wanted number of bits to toss
`buffered_n` is actual number of bytes in `next_int`
2025-08-05 17:09:41 -07:00
Ian Johnson
96be6f6566 std.compress.flate.Decompress: return correct size for unbuffered decompression
Closes #24686

As a bonus, this commit also makes the `git.zig` "testing `main`" compile again.
2025-08-04 19:32:14 -07:00
Andrew Kelley
a6f7927764 std.compress.flate.Decompress: use 64 buffered bits
will have to find out why usize doesn't work for 32 bit targets some
other time
2025-08-01 09:04:27 -07:00
Andrew Kelley
64814dc986 std.compress.flate.Decompress: respect stream limit 2025-07-31 22:10:11 -07:00
Andrew Kelley
a7808892f7 std.compress.flate.Decompress: be in indirect or direct mode
depending on whether buffered
2025-07-31 22:10:11 -07:00
Andrew Kelley
6eac56caf7 std.compress.flate.Decompress: allow users to swap out Writer 2025-07-31 22:10:11 -07:00
Andrew Kelley
c9ff068391 std.compress: fix discard impl and flate error detection 2025-07-31 22:10:11 -07:00
Andrew Kelley
111305678c std: match readVec fn prototype exactly
this is not necessary according to zig language, but works around a flaw
in the C backend
2025-07-31 22:10:11 -07:00
Andrew Kelley
84e4343b0c fix test failures by adding readVec 2025-07-31 22:10:11 -07:00
Andrew Kelley
afe9f3a9ec std.compress.flate.Decompress: implement readVec and discard 2025-07-31 22:10:11 -07:00
Andrew Kelley
6bcced31a0 fix 32-bit compilation 2025-07-31 22:10:11 -07:00
Andrew Kelley
42b10f08cc std.compress.flate.Decompress: delete bad unit tests
if I remove the last input byte from "don't read past deflate stream's
end" (on master branch), the test fails with error.EndOfStream. what,
then, is it supposed to be testing?
2025-07-31 22:10:11 -07:00
Andrew Kelley
5f790464b0 std.compress.flate.Decompress: hashing is out of scope
This API provides the data; applications can verify their own checksums.
2025-07-31 22:10:11 -07:00
Andrew Kelley
4741a16d9a putting stuff back does not require mutation 2025-07-31 22:10:11 -07:00
Andrew Kelley
05ce1f99a6 compiler: update to new flate API 2025-07-31 22:10:11 -07:00
Andrew Kelley
2569f4ff85 simplify tossBitsEnding 2025-07-31 22:10:11 -07:00
Andrew Kelley
5bc63794cc fix takeBitsEnding 2025-07-31 22:10:11 -07:00
Andrew Kelley
63f496c4f9 make takeBits deal with integers only 2025-07-31 22:10:11 -07:00
Andrew Kelley
c00fb86db6 fix peekBitsEnding 2025-07-31 22:10:11 -07:00
Andrew Kelley
8ab91a6fe9 error.EndOfStream disambiguation 2025-07-31 22:10:11 -07:00
Andrew Kelley
f644f40702 implement tossBitsEnding 2025-07-31 22:10:11 -07:00
Andrew Kelley
2d8d0dd9b0 std.compress.flate.Decompress: unfuck the test suite 2025-07-31 22:10:11 -07:00
Andrew Kelley
c684b21b4f simplify test cases 2025-07-31 22:10:11 -07:00
Andrew Kelley
ac4fbb427b std.compress.flate.Decompress: don't compute checksums
These have no business being in-bound; simply provide the expected
values to user code for maximum flexibility.
2025-07-31 22:10:11 -07:00
Andrew Kelley
5f571f53d6 refactor gzip test cases
zig newbies love using for loops in unit tests
2025-07-31 22:10:11 -07:00
Andrew Kelley
e73ca2444e std.compress.flate.Decompress: implement peekBitsEnding and writeMatch 2025-07-31 22:10:11 -07:00
Andrew Kelley
7bf91d705c fix bit read not at eof 2025-07-31 22:10:11 -07:00
Andrew Kelley
73e5594c78 std.compress.flate.Decompress: fix bit read at eof 2025-07-31 22:10:11 -07:00
Andrew Kelley
9c8cb777d4 std.compress.flate.Decompress: implement more bit reading 2025-07-31 22:10:11 -07:00
Andrew Kelley
6509fa1cf3 std.compress.flate.Decompress: passing basic test case 2025-07-31 22:10:11 -07:00
Andrew Kelley
88ca750209 std.compress.flate.Decompress: add rebase impl 2025-07-31 22:10:11 -07:00
Andrew Kelley
1b43551190 std.Io: remove BitWriter 2025-07-31 22:10:11 -07:00
Andrew Kelley
824c157e0c std.compress.flate: finish reorganizing 2025-07-31 22:10:11 -07:00
Andrew Kelley
a4f05a4588 delete flate implementation 2025-07-31 22:10:11 -07:00
Andrew Kelley
83513ade35 std.compress: rework flate to new I/O API 2025-07-31 22:10:11 -07:00
Andrew Kelley
9f27d770a1 std.io: deprecated Reader/Writer; introduce new API 2025-07-07 22:43:51 -07:00
Shun Sakai
a3ad0a2f77 std.compress.flate.Lookup: Replace invisible doc comments with top-level doc comments
I think it would be better if this invisible doc comments is top-level
doc comments rather than doc comments. Because it is at the start of a
source file. This makes the doc comments visible.
2025-01-22 23:34:57 +09:00
mlugg
0fe3fd01dd
std: update std.builtin.Type fields to follow naming conventions
The compiler actually doesn't need any functional changes for this: Sema
does reification based on the tag indices of `std.builtin.Type` already!
So, no zig1.wasm update is necessary.

This change is necessary to disallow name clashes between fields and
decls on a type, which is a prerequisite of #9938.
2024-08-28 08:39:59 +01:00
Jora Troosh
13070448f5
std: fix typos (#20560) 2024-07-09 14:25:42 -07:00
Pavel Verigo
d4d1efeb3e std.compress.flate: fix panic when reading into empty buffer 2024-05-09 15:51:42 -07:00