* Introduce common `bzero` libc implementation.
* Update test name according to review
Co-authored-by: Linus Groh <mail@linusgroh.de>
* address code review
- import common implementation when musl or wasi is included
- don't use `c_builtins`, use `@memset`
* bzero calling conv to .c
* Apply review
Co-authored-by: Veikka Tuominen <git@vexu.eu>
---------
Co-authored-by: Linus Groh <mail@linusgroh.de>
Co-authored-by: Veikka Tuominen <git@vexu.eu>
This lays the groundwork for #2879. This library will be built and linked when a
static libc is going to be linked into the compilation. Currently, that means
musl, wasi-libc, and MinGW-w64. As a demonstration, this commit removes the musl
C code for a few string functions and implements them in libzigc. This means
that those libzigc functions are now load-bearing for musl and wasi-libc.
Note that if a function has an implementation in compiler-rt already, libzigc
should not implement it. Instead, as we recently did for memcpy/memmove, we
should delete the libc copy and rely on the compiler-rt implementation.
I repurposed the existing "universal libc" code to do this. That code hadn't
seen development beyond basic string functions in years, and was only usable-ish
on freestanding. I think that if we want to seriously pursue the idea of Zig
providing a freestanding libc, we should do so only after defining clear goals
(and non-goals) for it. See also #22240 for a similar case.
Unfortunately some duplicate files must remain in lib/libc/wasi/libc-top-half
because they include internal headers *in the same directory* which have edits
relative to upstream musl. Because C is an amazing language, there is no way to
make it so that e.g. upstream musl's src/stdio/fputc.c includes wasi-libc's
src/stdio/putc.h instead of the upstream putc.h. The preprocessor always
searches the current directory first for quote includes.
Anyway, this still takes us from 2.9M to 1.4M for the combination of
lib/libc/wasi and lib/libc/include/wasm-wasi-musl, so I still call it a win.
crti.o/crtn.o is a legacy strategy for calling constructor functions
upon object loading that has been superseded by the
init_array/fini_array mechanism.
Zig code depends on neither, since the language intentionally has no way
to initialize data at runtime, but alas the Zig linker still must
support this feature since popular languages depend on it.
Anyway, the way it works is that crti.o has the machine code prelude of
two functions called _init and _fini, each in their own section with the
respective name. crtn.o has the machine code instructions comprising the
exitlude for each function. In between, objects use the .init and .fini
link section to populate the function body.
This function is then expected to be called upon object initialization
and deinitialization.
This mechanism is depended on by libc, for example musl and glibc, but
only for older ISAs. By the time the libcs gained support for newer
ISAs, they had moved on to the init_array/fini_array mechanism instead.
For the Zig linker, we are trying to move the linker towards
order-independent objects which is incompatible with the legacy
crti/crtn mechanism.
Therefore, this commit drops support entirely for crti/crtn mechanism,
which is necessary since the other commits in this branch make it
nondeterministic in which order the libc objects and the other link
inputs are sent to the linker.
The linker is still expected to produce a deterministic output, however,
by ignoring object input order for the purposes of symbol resolution.
This is, at least today, a very broken target: It doesn't actually build either
musl or wasi-libc even if you use -lc. It does give you musl headers, but that's
it. Those headers are not terribly useful, however, without any implementation
code. You can sort of call some math functions because they just so happen to
have implementations in compiler-rt. But that's only true for a small subset,
and I don't think users should be relying on the ABI surface of a library that
is an implementation detail of the compiler.
Clearly, a freestanding-capable libc of sorts is a useful thing as evidenced by
newlib, picolibc, etc existing. However, calling it "musl" is misleading when it
isn't actually musl-compatible, nor can it ever be because the musl API surface
is inextricably tied to the Linux kernel. In the discussion on #20690, there was
agreement that once we split up the API and ABI components in the target string,
the API component should be about compatibility, not whether you literally get a
particular implementation of it. Also, we decided that Linux musl and wasi-libc
musl shouldn't use the same API tag precisely because they're not actually
compatible.
(And besides, how would any syscall even be implemented in freestanding? Who or
what would we be calling?)
So I think we should remove this triple for now. If we decide to reintroduce
something like this, especially once #2879 gets going, we should come up with a
bespoke name for it rather than using "musl".
Whatever was in the frame pointer register prior to clone() will no longer be
valid in the child process, so zero it to protect FP-based unwinders. This is
just an extension of what was already done for i386 and x86_64. Only applied
to architectures where the _start() code also zeroes the frame pointer.