At the moment, the LLVM IR we generate for this fn is
define internal fastcc void @AstGen.numberLiteral ... {
Entry:
...
%16 = alloca %"fmt.parse_float.decimal.Decimal(f128)", align 8
...
That `Decimal` is huuuge! It stores
pub const max_digits = 11564;
digits: [max_digits]u8,
on the stack.
It comes from `convertSlow` function, which LLVM happily inlined,
despite it being the cold path. Forbid inlining that to not penalize
callers with excessive stack usage.
Backstory: I was looking for needles memcpys in TigerBeetle, and came up
with this copyhound.zig tool for doing just that:
ee67e2ab95/src/copyhound.zig
Got curious, run it on the Zig's own code base, and looked at some of
the worst offenders.
List of worst offenders:
warning: crypto.kyber_d00.Kyber.SecretKey.decaps: 7776 bytes memcpy
warning: crypto.ff.Modulus.powPublic: 8160 bytes memcpy
warning: AstGen.numberLiteral: 11584 bytes memcpy
warning: crypto.tls.Client.init__anon_133566: 13984 bytes memcpy
warning: http.Client.connectUnproxied: 16896 bytes memcpy
warning: crypto.tls.Client.init__anon_133566: 16904 bytes memcpy
warning: objcopy.ElfFileHelper.tryCompressSection: 32768 bytes memcpy
Note from Andrew: I removed `noinline` from this commit since it should
be enough to set it to be cold.
Instead, we now have a looser helper called `checkContains(...)`
that will match on any occurrence similarly to `std.mem.indexOf()`.
While at it, I have cleaned up other combinators to make the entire
API more consistent, and so:
* `checkStart(phrase)` is now `checkStart()` followed by
`checkExact(phrase)`
* `checkNext(phrase)` if matching exactly is now `checkExact(phrase)`
* `checkNext(phrase)` if matching loosely is now `checkContains(phrase)`
* `checkNext(phrase)` if matching exactly with var extractors is now
`checkExtract(phrase)`
Finally, `ElfDumper` is now dumping contents of `.symtab` and `.dynsym`
symbol tables. I have also removed dumping of symtabs as optional - they
are now always dumped which cleaned up the implementation even more.
It is assumed that generating a collision requires more than 2^156
ciphertext modifications. This is plenty enough for any practical
purposes, but it hasn't been proven to be >= 2^256.
Be consistent and conservative here; just claim the same security
as the other variants.
* move inferred error sets into InternPool.
- they are now represented by pointing directly at the corresponding
function body value.
* inferred error set working memory is now in Sema and expires after
the Sema for the function corresponding to the inferred error set is
finished having its body analyzed.
* error sets use a InternPool.Index.Slice rather than an actual slice
to avoid lifetime issues.
The code removed does unnecessary copying in order to create a null-terminated pointer, just to pass it to libc getenv. It only does this for `small keys`, which are under 64 bytes in size.
Instead of going out of the way to add a null byte to a function that takes normal slices, this should just be handled by the loop below, which scans c.environ to find the value
And when we have the choice, favor little-endian because it's 2023.
Gives a slight performance improvement:
md5: 552 -> 555 MiB/s
sha1: 768 -> 786 MiB/s
sha512: 211 -> 217 MiB/s
* Small documentation fix of ChaCha variants
Previous documentation was seemingly copy-pasted and left
behind some errors where the number of rounds was not
properly updated.
* Suggest `std.crypto.utils.secureZero` on `@memset` docs
* Revert previous change
The 'at least' behavior of the Allocator interface was removed in #13666, so anything that used reallocAtLeast or the .at_least Exact behavior could still have doc comments that reference no-longer-true behavior.
Funnily enough, ArrayList is the only place that used this functionality (outside of allocator test cases), so its doc comments are the only things that need to be fixed. This was checked by resetting to deda6b5146 and searching for all instances of `reallocAtLeast` and `.at_least` (one of which would need to be used to get the `.at_least` behavior)
Don't pass the object files from a static library to the linker invocation.
The lib.a file already contains them.
Avoids "duplicate symbol" errors (and useless work by the linker)
I tested this and this definitely compiles and these
changes were done programmatically but if there's still anything wrong
it shouldn't be hard to fix.
With this change it's going to be very easy to make further adjustments
to the calling conventions of all these external UEFI functions.
Closes#16309