Commit graph

660 commits

Author SHA1 Message Date
Andrew Kelley
22db1e166a std.crypto.CertificateBundle: disable test on WASI 2023-01-02 16:57:15 -07:00
Andrew Kelley
7ed7bd247e std.crypto.tls: verify the common name matches 2023-01-02 16:57:15 -07:00
Andrew Kelley
244a97e8ad std.crypto.tls: certificate signature validation 2023-01-02 16:57:15 -07:00
Andrew Kelley
504070e8fc std.crypto.CertificateBundle: ignore duplicate certificates 2023-01-02 16:57:15 -07:00
Andrew Kelley
bbc074252c introduce std.crypto.CertificateBundle
for reading root certificate authority bundles from standard
installation locations on the file system. So far only Linux logic is
added.
2023-01-02 16:57:15 -07:00
Andrew Kelley
3237000d95 std.crypto.tls: rudimentary certificate parsing 2023-01-02 16:57:15 -07:00
Andrew Kelley
5d7eca6669 std.crypto.tls.Client: fix verify_data for batched handshakes 2023-01-02 16:57:15 -07:00
Andrew Kelley
e2c16d03ab std.crypto.tls.Client: support secp256r1 for handshake 2023-01-02 16:57:15 -07:00
Andrew Kelley
f460c21705 std.crypto.tls.Client: avoid hard-coded bytes in key_share 2023-01-02 16:57:15 -07:00
Andrew Kelley
7a23778384 std.crypto.tls: send a legacy session id
To support middlebox compatibility mode.
2023-01-02 16:57:15 -07:00
Andrew Kelley
e2efba76aa std.crypto.tls: refactor to remove mutations
build up the hello message with array concatenation and helper functions
rather than hard-coded offsets and lengths.
2023-01-02 16:57:15 -07:00
Andrew Kelley
41f4461cda std.crypto.tls.Client: verify the server's Finished message 2023-01-02 16:57:15 -07:00
Andrew Kelley
f6c3a86f0f std.crypto.tls.Client: remove unnecessary coercion 2023-01-02 16:57:15 -07:00
Andrew Kelley
8ef4dcd39f std.crypto.tls: add some benchmark data points
Looks like aegis-128l is the winner on baseline too.
2023-01-02 16:57:15 -07:00
Andrew Kelley
942b5b468f std.crypto.tls: implement the rest of the cipher suites
Also:
 * Use KeyPair.create() function
 * Don't bother with CCM
2023-01-02 16:57:15 -07:00
Andrew Kelley
93ab8be8d8 extract std.crypto.tls.Client into separate namespace 2023-01-02 16:57:15 -07:00
Andrew Kelley
02c33d02e0 std.crypto.Tls: parse encrypted extensions 2023-01-02 16:57:15 -07:00
Andrew Kelley
462b3ed69c std.crypto.Tls: handshake fixes
* Handle multiple handshakes in one encrypted record
 * Fix incorrect handshake length sent to server
2023-01-02 16:57:15 -07:00
Andrew Kelley
b97fc43baa std.crypto.Tls: client is working against some servers 2023-01-02 16:57:15 -07:00
Andrew Kelley
40a85506b2 std.crypto.Tls: add read/write methods 2023-01-02 16:57:15 -07:00
Andrew Kelley
595fff7cb6 std.crypto.Tls: decrypting handshake messages 2023-01-02 16:57:15 -07:00
Andrew Kelley
920e5bc4ff std.crypto.Tls: discard ChangeCipherSpec messages
The next step here is to decrypt encrypted records
2023-01-02 16:57:15 -07:00
Andrew Kelley
d2f5d0b199 std.crypto.Tls: parse the ServerHello handshake 2023-01-02 16:57:15 -07:00
Andrew Kelley
ba44513c2f std.http reorg; introduce std.crypto.Tls
TLS is capable of sending a Client Hello
2023-01-02 16:57:15 -07:00
Frank Denis
d86685ac96
sha3: define block_length as the rate, not as the state size (#14132)
In sponge-based constructions, the block size is not the same as
the state size. For practical purposes, it's the same as the rate.

Size this is a constant for a given type, we don't need to keep
a copy of that value in the state itself. Just use the constant
directly. This saves some bytes and may even be slightly faster.

More importantly:
Fixes #14128
2022-12-30 22:15:25 +00:00
Frank Denis
0d83487dd0 hkdf: add prk_length and extractInit()
The HKDF extract function uses HMAC under the hood, but requiring
applications to directly use HMAC functions reduces clarity and
feels like the wrong abstraction.

So, in order to get the PRK length, add a `prk_length` constant
that applications can use directly.

Also add an `extractInit()` function for cases where the keying
material has to be provided as multiple chunks.
2022-12-29 17:56:50 -05:00
Veikka Tuominen
622311fb9a update uses of overflow arithmetic builtins 2022-12-27 15:13:14 +02:00
Frank Denis
c9e3524d0b
HKDF allow expansion up to, and including <hash size> * 255 bytes (#14051)
Fixes #14050
2022-12-23 21:38:27 +00:00
r00ster91
aac2d6b56f std.builtin: rename Type.UnionField and Type.StructField's field_type to type 2022-12-17 14:11:33 +01:00
Veikka Tuominen
08b2d491bc update usages of @call 2022-12-13 13:14:20 +02:00
Frank Denis
14416b522e
Revert "std.crypto.aes: use software implementation in comptime context (#13792)" (#13798)
This reverts commit d4adf44200.

Unfortunately, this is not the right place to check if AES functions
are being used at comptime or not.
2022-12-07 03:49:20 +00:00
Frank Denis
d4adf44200
std.crypto.aes: use software implementation in comptime context (#13792)
Hardware-accelerated AES requires inline assembly code, which
cannot work at comptime.
2022-12-06 22:48:19 +00:00
Frank Denis
397881fefb treshold -> threshold 2022-12-05 19:25:10 -05:00
Frank Denis
4be1bb4aac
std.crypto benchmark: don't use a relative path to import std (#13772) 2022-12-05 04:44:14 +00:00
Frank Denis
7411be3c9e
std.crypto.edwards25519: add a rejectLowOrder() function (#13668)
Does what the name says: rejects generators of low-order groups.

`clearCofactor()` was previously used to do it, but for e.g.
cofactored signature verification, we don't need the result of an
actual multiplication. Only check that we didn't end up with a
low-order point, which is a faster operation.
2022-11-28 00:34:13 +01:00
Frank Denis
feb806a212
std.crypto.ed25519 incremental signatures: hash the fallback noise (#13643)
If the noise parameter was null, we didn't use any noise at all.

We unconditionally generated random noise (`noise2`) but didn't use it.

Spotted by @cryptocode, thanks!
2022-11-24 12:13:37 +01:00
Frank Denis
ea05223b63
std.crypto.auth: add AEGIS MAC (#13607)
* Update the AEGIS specification URL to the current draft

* std.crypto.auth: add AEGIS MAC

The Pelican-based authentication function of the AEGIS construction
can be used independently from authenticated encryption, as a faster
and more secure alternative to GHASH/POLYVAL/Poly1305.

We already expose GHASH, POLYVAL and Poly1305 for use outside AES-GCM
and ChaChaPoly, so there are no reasons not to expose the MAC from AEGIS
as well.

Like other 128-bit hash functions, finding a collision only requires
~2^64 attempts or inputs, which may still be acceptable for many
practical applications.

Benchmark (Apple M1):

    siphash128-1-3:       3222 MiB/s
             ghash:       8682 MiB/s
    aegis-128l mac:      12544 MiB/s

Benchmark (Zen 2):

    siphash128-1-3:       4732 MiB/s
             ghash:       5563 MiB/s
    aegis-128l mac:      19270 MiB/s
2022-11-22 18:16:04 +01:00
Frank Denis
c45c6cd492 Add the POLYVAL universal hash function
POLYVAL is GHASH's little brother, required by the AES-GCM-SIV
construction. It's defined in RFC8452.

The irreducible polynomial is a mirror of GHASH's (which doesn't
change anything in our implementation that didn't reverse the raw
bits to start with).

But most importantly, POLYVAL encodes byte strings as little-endian
instead of big-endian, which makes it a little bit faster on the
vast majority of modern CPUs.

So, both share the same code, just with comptime magic to use the
correct endianness and only double the key for GHASH.
2022-11-20 18:13:19 -05:00
Frank Denis
4dd061a7ac ghash: handle the .hi_lo case when no CLMUL acceleration is present, too 2022-11-17 23:54:21 +01:00
Frank Denis
3051e279a5 Reapply "std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)"
This reapplies commit 72d3f4b5dc.
2022-11-17 23:52:58 +01:00
Andrew Kelley
72d3f4b5dc Revert "std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)"
This reverts commit 7cfeae1ce7 which
is causing std lib tests to fail on wasm32-wasi.
2022-11-17 15:37:37 -07:00
Frank Denis
7cfeae1ce7
std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs (#13566)
* std.crypto.onetimeauth.ghash: faster GHASH on modern CPUs

Carryless multiplication was slow on older Intel CPUs, justifying
the need for using Karatsuba multiplication.

This is not the case any more; using 4 multiplications to multiply
two 128-bit numbers is actually faster than 3 multiplications +
shifts and additions.

This is also true on aarch64.

Keep using Karatsuba only when targeting x86 (granted, this is a bit
of a brutal shortcut, we should really list all the CPU models that
had a slow clmul instruction).

Also remove useless agg_2 treshold and restore the ability to
precompute only H and H^2 in ReleaseSmall.

Finally, avoid using u256. Using 128-bit registers is actually faster.

* Use a switch, add some comments
2022-11-17 13:07:07 +01:00
Frank Denis
7eed028f9a
crypto.bcrypt: fix massive speed regression when using stage2 (#13518)
state: State -> state: *const State
Suggested by @nektro

Fixes #13510
2022-11-14 16:37:19 +01:00
Naoki MATSUMOTO
b29057b6ab
std.crypto.ghash: fix uninitialized polynomial use (#13527)
In the process of 'remaining blocks',
the length of processed message can be from 1 to 79.
The value of 'n-1' is ranged from 0 to 3.
So, st.hx[i] must be initialized at least from st.hx[0] to st.hx[3]
2022-11-14 16:35:08 +01:00
Frank Denis
df7223c7f2 crypto.AesGcm: provision ghash for the final block 2022-11-11 18:04:22 +01:00
Frank Denis
59af6417bb
crypto.ghash: define aggregate tresholds as blocks, not bytes (#13507)
These constants were read as a block count in initForBlockCount()
but at the same time, as a size in update().

The unit could be blocks or bytes, but we should use the same one
everywhere.

So, use blocks as intended.

Fixes #13506
2022-11-10 19:00:00 +01:00
Frank Denis
36e618aef1 crypto.ghash: compatibility with stage1
Defining the selector enum outside the function definition is
required for stage1.
2022-11-08 16:59:53 +01:00
Frank Denis
7d48cb1138
std.crypto: make ghash faster, esp. for small messages (#13464)
* std.crypto: make ghash faster, esp. for small messages

Aggregated reduction requires 5 additional multiplications (to
precompute the powers of H), in order to save 2 multiplications
per batch.

So, only use large batches when it's actually interesting to do so.

For the last blocks, reuse the precomputations in order to perform
a single reduction.

Also, even in .ReleaseSmall, allow 2-block aggregation.
The speedup is worth it, and the code increase is reasonable.

And in .ReleaseFast, bump the upper batch size up to 16.

Leverage comptime by the way instead of duplicating code.

std/crypto/benchmark.zig on Apple M1:

    Zig 0.10.0: 2769 MiB/s
        Before: 6014 MiB/s
         After: 7334 MiB/s

Normalize function names by the way.

* Change clmul() to accept the half to be processed

This avoids a bunch of truncate() calls.

* Add more ghash tests to check all code paths
2022-11-07 21:45:29 +01:00
Frank Denis
32563e6829
crypto.core.aes: process 6 block in parallel instead of 8 on aarch64 (#13473)
* crypto.core.aes: process 6 block in parallel instead of 8 on aarch64

At least on Apple Silicon, this is slightly faster than 8 blocks.

* AES: add parallel blocks for tigerlake, rocketlake, alderlake, zen3
2022-11-07 12:28:37 +01:00
Frank Denis
907f3ef887
crypto.salsa20: make the number of rounds a comptime parameter (#13442)
...instead of hard-coding it to 20.

- This is consistent with the ChaCha implementation
- NaCl and libsodium, that this API is designed to interop with,
also support 8 and 12 round variants. The 12 round variant, in
particular, provides the same security level as the 20 round variant,
but is obviously faster.
- scrypt currently uses its own non optimized version of Salsa, just
because it use 8 rounds instead of 20. This will help remove code
duplication.

No behavior nor public API changes. The Salsa20 and XSalsa20 still
represent the 20-round variant.
2022-11-06 23:52:41 +01:00