Commit graph

5919 commits

Author SHA1 Message Date
Frank Denis
32563e6829
crypto.core.aes: process 6 block in parallel instead of 8 on aarch64 (#13473)
* crypto.core.aes: process 6 block in parallel instead of 8 on aarch64

At least on Apple Silicon, this is slightly faster than 8 blocks.

* AES: add parallel blocks for tigerlake, rocketlake, alderlake, zen3
2022-11-07 12:28:37 +01:00
Frank Denis
907f3ef887
crypto.salsa20: make the number of rounds a comptime parameter (#13442)
...instead of hard-coding it to 20.

- This is consistent with the ChaCha implementation
- NaCl and libsodium, that this API is designed to interop with,
also support 8 and 12 round variants. The 12 round variant, in
particular, provides the same security level as the 20 round variant,
but is obviously faster.
- scrypt currently uses its own non optimized version of Salsa, just
because it use 8 rounds instead of 20. This will help remove code
duplication.

No behavior nor public API changes. The Salsa20 and XSalsa20 still
represent the 20-round variant.
2022-11-06 23:52:41 +01:00
r00ster91
b83e4d9656 std.os.linux.T: translate more MIPS values
This fixes the broken terminal for me and thus fixes #13198.
2022-11-06 16:33:50 +02:00
Jay Petacat
694d8831c3 Revert "x86" CPU model (not arch) back to "i386"
PR #13101 recently renamed the "i386" architecture to "x86", and it
seems the specific CPU model got swept up in that. "x86" is an umbrella
term that describes a family of CPUs, and the "i386" is the oldest
supported model under that umbrella.
2022-11-06 13:39:03 +01:00
delitako
c8a5ad6d9d
Improve doc comments for two functions (#13456)
* std.heap: fix function name in doc comment

* std.mem: document return value of `replace`
2022-11-05 22:55:50 +01:00
Ali Chraghi
aea617c60b std.os: take advantage of the freebsd's copy_file_range 2022-11-05 15:43:39 -04:00
Jakub Konka
83d89a05b7 coff: compile and link simple exit program on arm64
* make image base target dependent
* fix relocs to imports
2022-11-05 10:15:01 +01:00
Andrew Kelley
1d68045919
Merge pull request #13101 from alichraghi/o4 2022-11-05 02:34:24 -04:00
Veikka Tuominen
8c4faa5f3f
Merge pull request #13338 from Vexu/stage2-compile-errors
Improve some error messages
2022-11-04 16:04:31 +02:00
Ali Chraghi
f5f1f8c666 all: rename i386 to x86 2022-11-04 00:09:27 +03:30
Yujiri
b19161ba9c Add docstrings to some functions in std.meta 2022-11-03 17:37:38 +02:00
Veikka Tuominen
678f3f6e65
Merge pull request #13276 from r00ster91/stem
std.fs.path: add stem()
2022-11-03 16:45:37 +02:00
Frank Denis
96793530cd
std.crypto.pwhash.bcrypt: inline the Feistel network function (#13416)
std/crypto/benchmark.zig results:

* Intel i5

before: 3.144 s/ops
 after: 1.922 s/ops

* Apple M1

before: 2.067 s/ops
 after: 1.373 s/ops
2022-11-03 13:10:08 +01:00
Nathan Bourgeois
e64eef366c
Translate-C Remainder Macro Fix 2022-11-03 14:07:00 +02:00
Eric Milliken
b40fc70188
std.time: add microTimestamp() (#13327) 2022-11-02 23:20:19 +01:00
Jacob Young
771dadc5e0 x86: cleanup inline asm
Multiple outputs work now, so use that instead of deleting clobbers.
2022-11-01 20:39:06 -04:00
Jacob Young
9b2db56d0f x86: remove inline asm clobbers that conflict with inputs or outputs 2022-11-01 20:39:06 -04:00
Jacob Young
757db665a7 build: remove ofmt from LibExeObjStep which is redundant with target.ofmt 2022-11-01 20:38:37 -04:00
Jacob Young
93d60d0de7 std: avoid vector usage with the C backend
Vectors are not yet implemented in the C backend, so no reason to
prevent code using the standard library from compiling in the meantime.
2022-11-01 20:38:37 -04:00
Frank Denis
0d192ee9ef
std.crypto.onetimeauth.Ghash: make GHASH 2 - 2.5x faster (#13374)
Rewrite GHASH to use 128-bit multiplication over non-reversed
integers, and up to 8 blocks aggregated reduction.

lib/std/crypto/benchmark.zig results:

Xeon E5:
  Before: 1604 MiB/s
   After: 4005 MiB/s

Apple M1:
  Before: 2769 MiB/s
   After: 6014 MiB/s

This also makes AES-GCM faster by the way.
2022-11-01 13:49:13 -04:00
Andrew Kelley
1780d7a348
Merge pull request #13368 from jacobly0/fix-aarch64-c 2022-11-01 13:28:40 -04:00
mnordine
2943df016e
Fix variable name in documentation sample (#13391) 2022-11-01 12:49:13 +01:00
Frank Denis
ddb9eac05c ed25519: recommend using the seed to recover a key pair 2022-11-01 07:26:32 +01:00
Frank Denis
9e44710fc4
Ed25519.KeyPair.fromSecretKey() didn't compile after the API changes (#13386)
Fixes #13378
2022-11-01 07:10:40 +01:00
Jacob Young
b945d3eb90 cbe: improve support for non-native float types
* Fix _start on aarch64.
 * Add fallbacks when a float type is unsupported.

Fixes #13357
2022-10-31 20:18:15 -04:00
Andrew Kelley
d02242661e std.heap: make wasm32 PageAllocator handle large allocation
When a number of bytes to be allocated is so great that alignForward()
is not possible, return `error.OutOfMemory`.

Companion commit to 3f3003097c.
2022-10-30 19:28:26 -07:00
Andrew Kelley
3f3003097c std.heap.PageAllocator: add check for large allocation
Instead of making the memory alignment functions more complicated, I
added more API documentation for their existing semantics.

closes #12118
closes #12135
2022-10-30 16:10:20 -07:00
Nameless
40e84a27d6
change uefi packed structs to new integer backed syntax (#13173)
* std.os.uefi: integer backed structs, add tests to catch regressions

device_path_protocol now uses extern structs with align(1) fields because
the transition to integer backed packed struct broke alignment

added comptime asserts that device_path_protocol structs do not violate
alignment and size specifications
2022-10-30 15:08:32 -04:00
Andrew Kelley
5339cac2ef std: re-enable auto hash test
It was temporarily disabled due to a self-hosted compiler
miscompilation.

closes #12178
2022-10-30 01:09:31 -07:00
Andrew Kelley
209a0d2a83
Merge pull request #13153 from squeek502/iterator-filename-limits
Windows: Fix Iterator name buffer size not handling all possible file name components
2022-10-29 23:50:37 -04:00
Andrew Kelley
5f5a20ebaf
Merge pull request #13093 from jacobly0/backend-fixes
C backend fixes
2022-10-29 23:06:59 -04:00
kkHAIKE
949cca9f2a Allocator: fix len_align calc in large type size case 2022-10-29 17:53:56 -04:00
Ryan Liptak
db80225a97 fs: Some NAME_MAX/MAX_NAME_BYTES improvements 2022-10-29 14:30:46 -07:00
Ryan Liptak
348f73502e Make MAX_NAME_BYTES on WASI equivalent to the max of the other platforms
Make the test use the minimum length and set MAX_NAME_BYTES to the maximum so that:
- the test will work on any host platform
- *and* the MAX_NAME_BYTES will be able to hold the max file name component on any host platform
2022-10-29 14:30:46 -07:00
Ryan Liptak
c5d23161fc Set wasi MAX_NAME_BYTES to minimum of the rest of the supported platforms
This is a slightly weird situation, because the 'real' value may depend on the host platform that the WASI is being executed on.
2022-10-29 14:30:45 -07:00
Ryan Liptak
dd0962d5ea Add wasi branch to MAX_NAME_BYTES 2022-10-29 14:30:45 -07:00
Ryan Liptak
c6ff1a7160 Windows: Fix iterator name buffer size not handling all possible file name components
Each u16 within a file name component can be encoded as up to 3 UTF-8 bytes, so we need to use MAX_NAME_BYTES to account for all possible UTF-8 encoded names.

Fixes #8268
2022-10-29 14:30:44 -07:00
Ryan Liptak
33fdc43714 std.fs: Add MAX_NAME_BYTES
Also add some NAME_MAX or equivalent definitions where necessary
2022-10-29 14:30:43 -07:00
fn ⌃ ⌥
a70c86e661 Fix deprecation docs for isAlpha and isCntrl 2022-10-29 15:22:05 -04:00
Veikka Tuominen
278c32976e parser: add helpful error for extra = in variable initializer
Closes #12768
2022-10-29 14:55:43 +03:00
Veikka Tuominen
d7314555f2 Sema: improve compile error for casting double pointer to anyopaque pointer
Closes #12042
2022-10-29 14:55:43 +03:00
Veikka Tuominen
9607bd90e6 parser: improve error message for missing var/const before local variable
Closes #12721
2022-10-29 14:55:43 +03:00
Jacob Young
48a2783969 cbe: implement optional slice representation change 2022-10-29 05:58:41 -04:00
Andrew Kelley
20925b2f5c
Merge pull request #13272 from topolarity/sha2-intrinsics
crypto.sha2: Use intrinsics for SHA-256 on x86-64 and AArch64
2022-10-29 03:31:42 -04:00
Andrew Kelley
c36eb4ede9
Merge pull request #13221 from topolarity/packed-mem
Introduce `std.mem.readPackedInt` and improve bitcasting of packed memory layouts
2022-10-28 21:15:16 -04:00
Zhora Trush
c66d3f6bf6 Enhance indexOfIgnoreCase with Boyer-Moore-Horspool algorithm 2022-10-28 20:29:41 -04:00
Cody Tapscott
67fa3262b1 std.crypto: Use featureSetHas to gate intrinsics
This also fixes a bug where the feature gating was not taking
effect at comptime due to https://github.com/ziglang/zig/issues/6768
2022-10-28 17:17:08 -07:00
Cody Tapscott
f9fe548e41 std.crypto: Add isComptime guard around intrinsics
Comptime code can't execute assembly code, so we need some way to
force comptime code to use the generic path. This should be replaced
with whatever is implemented for #868, when that day comes.

I am seeing that the result for the hash is incorrect in stage1 and
crashes stage2, so presumably this never worked correctly. I will follow
up on that soon.
2022-10-28 15:21:10 -07:00
Cody Tapscott
4c1f71e866 std.crypto: Optimize SHA-256 intrinsics for AMD x86-64
This gets us most of the way back to the performance I had when
I was using the LLVM intrinsics:
  - Intel Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz:
       190.67 MB/s (w/o intrinsics) -> 1285.08 MB/s
  - AMD EPYC 7763 (VM) @ 2.45 GHz:
       240.09 MB/s (w/o intrinsics) -> 1360.78 MB/s
  - Apple M1:
       216.96 MB/s (w/o intrinsics) -> 2133.69 MB/s

Minor changes to this source can swing performance from 400 MB/s to
1400 MB/s or... 20 MB/s, depending on how it interacts with the
optimizer. I have a sneaking suspicion that despite LLVM inheriting
GCC's extremely strict inline assembly semantics, its passes are
rather skittish around inline assembly (and almost certainly, its
instruction cost models can assume nothing)
2022-10-28 15:21:10 -07:00
Cody Tapscott
ee241c47ee std.crypto: SHA-256 Properly gate comptime conditional
This feature detection must be done at comptime so that we avoid
generating invalid ASM for the target.
2022-10-28 15:21:10 -07:00