36077 commits

Author SHA1 Message Date
Ryan Liptak
59b8bed222 Teach fs.path about the wonderful world of Windows paths
Previously, fs.path handled a few of the Windows path types, but not all of them, and only a few of them correctly/consistently. This commit aims to make `std.fs.path` correct and consistent in handling all possible Win32 path types.

This commit also slightly nudges the codebase towards a separation of Win32 paths and NT paths, as NT paths are not actually distinguishable from Win32 paths from looking at their contents alone (i.e. `\Device\Foo` could be an NT path or a Win32 rooted path, no way to tell without external context). This commit formalizes `std.fs.path` being fully concerned with Win32 paths, and having no special detection/handling of NT paths.

Resources on Windows path types, and Win32 vs NT paths:

- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html
- https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

API additions/changes/deprecations

- `std.os.windows.getWin32PathType` was added (it is analogous to `RtlDetermineDosPathNameType_U`), while `std.os.windows.getNamespacePrefix` and `std.os.windows.getUnprefixedPathType` were deleted. `getWin32PathType` forms the basis on which the updated `std.fs.path` functions operate.
- `std.fs.path.parsePath`, `std.fs.path.parsePathPosix`, and `std.fs.path.parsePathWindows` were added, while `std.fs.path.windowsParsePath` was deprecated. The new `parsePath` functions provide the "root" and the "kind" of a path, which is platform-specific. The now-deprecated `windowsParsePath` did not handle all possible path types, while the new `parsePathWindows` does.
- `std.fs.path.diskDesignator` has been deprecated in favor of `std.fs.path.parsePath`, and likewise `diskDesignatorWindows` has been deprecated in favor of `parsePathWindows`.
- `relativeWindows` is now a compile error when *not* targeting Windows, while `relativePosix` is now a compile error when targeting Windows. This is because those functions read/use the CWD path which will behave improperly when used from a system with different path semantics (e.g. calling `relativePosix` from a Windows system with a CWD like `C:\foo\bar` will give you a bogus result since that'd be treated as a single relative component when using POSIX semantics). This also allows `relativeWindows` to use Windows-specific APIs for getting the CWD and environment variables to cut down on allocations.
- `componentIterator`/`ComponentIterator.init` have been made infallible. These functions used to be able to error on UNC paths with an empty server component, and on paths that were assumed to be NT paths, but now:
  + We follow the lead of `RtlDetermineDosPathNameType_U`/`RtlGetFullPathName_U` in how it treats a UNC path with an empty server name (e.g. `\\\share`) and allow it, even if it'll be invalid at the time of usage
  + Now that `std.fs.path` assumes paths are Win32 paths and not NT paths, we don't have to worry about NT paths
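
To make the path types concrete, here is a rough Python sketch of a classifier in the spirit of `RtlDetermineDosPathNameType_U`. It is not the code from this commit: the enum and function names are made up for illustration, the handling of `?` alongside `.` for local device paths is an assumption, and it works on Python code points rather than WTF-16 code units.

```python
from enum import Enum


class Win32PathType(Enum):
    # Names are hypothetical; they only mirror the usual path categories.
    UNC_ABSOLUTE = "unc_absolute"            # \\server\share
    DRIVE_ABSOLUTE = "drive_absolute"        # C:\foo
    DRIVE_RELATIVE = "drive_relative"        # C:foo
    ROOTED = "rooted"                        # \foo
    RELATIVE = "relative"                    # foo
    LOCAL_DEVICE = "local_device"            # \\.\foo or \\?\foo
    ROOT_LOCAL_DEVICE = "root_local_device"  # exactly \\. or \\?


def is_sep(ch):
    # Win32 paths accept both separators interchangeably.
    return ch in ("\\", "/")


def classify(path):
    if not path:
        return Win32PathType.RELATIVE
    if is_sep(path[0]):
        if len(path) < 2 or not is_sep(path[1]):
            return Win32PathType.ROOTED
        # Two leading separators: UNC unless followed by `.` or `?`.
        if len(path) < 3 or path[2] not in ".?":
            return Win32PathType.UNC_ABSOLUTE
        if len(path) == 3:
            return Win32PathType.ROOT_LOCAL_DEVICE
        if is_sep(path[3]):
            return Win32PathType.LOCAL_DEVICE
        return Win32PathType.UNC_ABSOLUTE
    if len(path) >= 2 and path[1] == ":":
        if len(path) >= 3 and is_sep(path[2]):
            return Win32PathType.DRIVE_ABSOLUTE
        return Win32PathType.DRIVE_RELATIVE
    return Win32PathType.RELATIVE
```

Note that `\Device\Foo` classifies as a rooted path here, which matches the point above that an NT path cannot be told apart from a Win32 path by its contents alone.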

Behavior changes

- `std.fs.path` generally: any combinations of mixed path separators for UNC paths are universally supported, e.g. `\/server/share`, `/\server\share`, `/\server/\\//share` are all seen as equivalent UNC paths
- `resolveWindows` handles all path types more appropriately/consistently.
  + `//` and `//foo` used to be treated as a relative path, but are now seen as UNC paths
  + If a rooted/drive-relative path cannot be resolved against anything more definite, the result will remain a rooted/drive-relative path.
  + I've created [a script to generate the results of a huge number of permutations of different path types](https://gist.github.com/squeek502/9eba7f19cad0d0d970ccafbc30f463bf) (the result of running the script is also included for anyone that'd like to vet the behavior).
- `dirnameWindows` now treats the drive-relative root as the dirname of a drive-relative path with a component, e.g. `dirname("C:foo")` is now `C:`, whereas before it would return null. `dirnameWindows` also handles local device paths appropriately now.
- `basenameWindows` now handles all path types more appropriately. The most notable change here is `//a` being treated as a partial UNC path now and therefore `basename` will return `""` for it, whereas before it would return `"a"`
- `relativeWindows` will now do its best to resolve against the most appropriate CWD for each path, e.g. relative for `D:foo` will look at the CWD to check if the drive letter matches, and if not, look at the special environment variable `=D:` to get the shell-defined CWD for that drive, and if that doesn't exist, then it'll resolve against `D:\`.
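
The CWD fallback order described for `relativeWindows` can be sketched roughly as follows. This is an illustrative Python model only: `cwd_for_drive` is a hypothetical helper, the environment is passed in as a plain dict with upper-case keys, and the real implementation reads the CWD from the PEB instead.

```python
def cwd_for_drive(drive_letter, process_cwd, env):
    """Pick the directory that a drive-relative path like `D:foo` resolves against."""
    drive = drive_letter.upper()

    # 1. Use the process CWD if it is on the requested drive.
    if (
        len(process_cwd) >= 2
        and process_cwd[1] == ":"
        and process_cwd[0].upper() == drive
    ):
        return process_cwd

    # 2. Otherwise consult the special `=D:` variable holding the
    #    shell-defined CWD for that drive.
    per_drive = env.get("=" + drive + ":")
    if per_drive:
        return per_drive

    # 3. Fall back to the root of the drive.
    return drive + ":\\"
```

For example, with a process CWD of `C:\foo` and no `=D:` variable set, `D:bar` would be resolved against `D:\` under this model.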

Implementation details

- `resolveWindows` previously looped through the paths twice to build up the relevant info before doing the actual resolution. Now, `resolveWindows` iterates backwards once and keeps track of which paths are actually relevant using a bit set, which also allows it to break from the loop when it's no longer possible for earlier paths to matter.
- A standalone test was added to test parts of `relativeWindows` since the CWD resolution logic depends on CWD information from the PEB and environment variables
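
The single backwards pass with a bit set can be illustrated with a deliberately simplified sketch. It assumes POSIX-style semantics (a path starting with `/` is absolute and overrides everything before it) instead of the Windows path rules that `resolveWindows` actually handles, and both function names are hypothetical.

```python
def relevant_paths(paths):
    """Return a bitmask marking which inputs can affect the resolved result."""
    mask = 0
    for i in range(len(paths) - 1, -1, -1):
        if not paths[i]:
            continue  # empty paths contribute nothing
        mask |= 1 << i
        if paths[i].startswith("/"):
            break  # an absolute path makes all earlier paths irrelevant
    return mask


def resolve(paths):
    mask = relevant_paths(paths)
    parts = []
    absolute = False
    for i, path in enumerate(paths):
        if not (mask >> i) & 1:
            continue  # skipped without ever inspecting its components
        if path.startswith("/"):
            absolute = True
        for comp in path.split("/"):
            if comp in ("", "."):
                continue
            if comp == "..":
                if parts and parts[-1] != "..":
                    parts.pop()
                elif not absolute:
                    parts.append("..")
            else:
                parts.append(comp)
    joined = "/".join(parts)
    if absolute:
        return "/" + joined
    return joined or "."
```

The early `break` is what the backwards direction buys: once a fully qualified path is seen, nothing earlier in the list needs to be examined.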

Edge cases worth noting

- A strange piece of trivia that I found out while working on this is that it's technically possible to have a drive letter that is outside the intended A-Z range, or even outside the ASCII range entirely. Since we deal with both WTF-8 and WTF-16 paths, `path[0]`/`path[1]`/`path[2]` will not always refer to the same bits of information, so to get consistent behavior, some decision about how to deal with this edge case had to be made. I've made the choice to conform with how `RtlDetermineDosPathNameType_U` works, i.e. treat the first WTF-16 code unit as the drive letter. This means that when working with WTF-8, checking for drive-relative/drive-absolute paths is a bit more complicated. For more details, see the lengthy comment in `std.os.windows.getWin32PathType`.
- `relativeWindows` will now almost always be able to return either a fully-qualified absolute path or a relative path, but there's one scenario where it may return a rooted path: when the CWD gotten from the PEB is not a drive-absolute or UNC path (if that's actually feasible/possible?). An alternative approach to this scenario might be to resolve against the `HOMEDRIVE` env var if available, and/or default to `C:\` as a last resort in order to guarantee the result of `relative` is never a rooted path.
- Partial UNC paths (e.g. `\\server` instead of `\\server\share`) are a bit awkward to handle, generally. Not entirely sure how best to handle them, so there may need to be another pass in the future to iron out any issues that arise. As of now the behavior is:
  + For `relative`, any part of a UNC disk designator is treated as the "root" and therefore isn't applicable for relative paths. For example, calling `relative` with `\\server` and `\\server\share` will result in `\\server\share` rather than just `share`, and calling it with `\\server\foo` and `\\server\bar` will result in `\\server\bar` rather than `..\bar`
  + For `resolve`, any part of a UNC disk designator is also treated as the "root", but relative and rooted paths are still eligible for filling in missing portions of the disk designator, e.g. `resolve` with `\\server` and `foo` or `\foo` will result in `\\server\foo`
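
The drive-letter edge case above is easier to see with a small sketch. The Python below is only a model of the idea of treating the first WTF-16 code unit as the drive letter. The helper names are hypothetical, and the real logic lives in `std.os.windows.getWin32PathType`.

```python
def utf16_units(text):
    # "surrogatepass" lets unpaired surrogates through, loosely mimicking WTF-16.
    data = text.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def has_drive_prefix(text):
    """True if the second UTF-16 code unit is ':' and the first is not a separator."""
    units = utf16_units(text)
    return (
        len(units) >= 2
        and units[0] not in (ord("\\"), ord("/"))
        and units[1] == ord(":")
    )
```

Under this model `€:foo` has a drive prefix, because `€` is a single code unit, even though byte index 1 of its UTF-8 encoding is not `:`. A path starting with a character outside the Basic Multilingual Plane does not, because that character occupies two code units and the second one is a low surrogate.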

Fixes #25703
Closes #25702
2025-11-21 00:03:44 -08:00
Alex Rønne Petersen
f3eef35c05
aro: unbreak s390x
https://github.com/ziglang/zig/pull/25780#discussion_r2548496117
2025-11-21 06:28:19 +01:00
rpkak
6b4f45f782 system specific errno 2025-11-20 15:03:23 -08:00
Benjamin Jurk
4b5351bc0d
update deprecated ArrayListUnmanaged usage (#25958) 2025-11-20 14:46:23 -08:00
Andrew Kelley
db622f14c4
Merge pull request #25780 from Vexu/translate-c
Update Aro and translate-c to latest
2025-11-20 10:24:31 -08:00
Matthew Lugg
8a73fc8d8e
Merge pull request #25981 from mlugg/macos-fuzz-2
make the fuzzer vaguely work on macOS
2025-11-20 17:48:35 +00:00
Andrew Kelley
a9568ed296
Merge pull request #25898 from jacobly0/elfv2-progress
Elf2: more progress
2025-11-20 04:33:04 -08:00
Veikka Tuominen
df50f9e28e update resinator to Aro changes 2025-11-20 13:12:53 +02:00
Veikka Tuominen
21f3ff2a8d update Aro and translate-c to latest 2025-11-20 13:12:53 +02:00
Matthew Lugg
a87b533231
std.Io.Writer: fix some bugs 2025-11-20 10:42:21 +00:00
Matthew Lugg
b05fefb9c9
std.http: stop assuming previous chunk state
The full file may not be written, either due to a previous chunk being
in-progress when `sendFile` was called, or due to `limit`.
2025-11-20 10:42:21 +00:00
Matthew Lugg
bc524a2b1a
std.Build: fix crashes running fuzz tests 2025-11-20 10:42:21 +00:00
Matthew Lugg
0f06b5b583
std.debug.MachOFile: handle 'path/to/archive.a(entry.o)' form 2025-11-20 10:42:21 +00:00
Matthew Lugg
e1fa4011fb
fuzz: hack around unknown module structure 2025-11-20 10:42:20 +00:00
Matthew Lugg
010dcd6a9b
fuzzer: account for runtime address slide
This is relevant to PIEs, which are notably enabled by default on macOS.
The build system needs to only see virtual addresses, that is, those
which do not have the slide applied; but the fuzzer itself naturally
sees relocated addresses (i.e. with the slide applied). We just need to
subtract the slide when we communicate addresses to the build system.
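
As a worked example of that subtraction (a Python sketch with made-up addresses; `to_virtual_address` is a hypothetical name, not a function from the fuzzer):

```python
def to_virtual_address(runtime_address, slide):
    """Undo the load-time slide so the address matches the on-disk binary."""
    return runtime_address - slide


# Hypothetical values: a PIE loaded with a slide of 0x4a000.
link_time_address = 0x100003F00
slide = 0x4A000
runtime_address = link_time_address + slide  # what the running fuzzer sees
```

Here `to_virtual_address(runtime_address, slide)` gives back `0x100003F00`, the unslid address that the build system expects.
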
2025-11-20 10:42:20 +00:00
Matthew Lugg
0a330d4f94
std.debug.Info: basic Mach-O support 2025-11-20 10:42:20 +00:00
Matthew Lugg
0caca625eb
std.debug: split up Mach-O debug info handling
Like ELF, we now have `std.debug.MachOFile` for the host-independent
parts, and `std.debug.SelfInfo.MachO` for logic requiring the file to
correspond to the running program.
2025-11-20 10:42:20 +00:00
Jacob Young
7b325e08c9 Elf2: revert incorrect endian fix 2025-11-19 22:19:57 -05:00
Alex Rønne Petersen
f06adc70b0
ci: bump timeout for s390x-linux-debug to 5 hours on Forgejo Actions 2025-11-20 03:34:42 +01:00
Alex Rønne Petersen
c9afa901f3
ci: bump timeout for s390x-linux-release to 4 hours on Forgejo Actions 2025-11-19 23:47:58 +01:00
Alex Rønne Petersen
5078acf3a3
std.Io.net: disable listen on a unix socket, send bytes, receive bytes on Windows
https://github.com/ziglang/zig/issues/25983
2025-11-19 21:51:57 +01:00
Alex Rønne Petersen
4ed5bb0bac
ci: bump aarch64-linux-debug timeout to 3 hours on Forgejo Actions 2025-11-19 20:45:36 +01:00
Harold
2f240d0819 std.Build.Step.Compile: add support for '-z defs' flag 2025-11-19 20:13:54 +01:00
Jan200101
bd832ed39a std.Io.Threaded: add missing statx masks
statx does not guarantee that the values requested by the mask will be
present, nor that those not requested will be absent, which is why this worked.
2025-11-19 20:13:25 +01:00
Alex Rønne Petersen
43371cf388
Merge pull request #25965 from alexrp/s390x
`s390x-linux` and general big-endian stuff
2025-11-19 19:52:18 +01:00
Ryan Liptak
26afcdb7fe std.process: Actually use explicit GetCwdError/GetCwdAllocError sets
Also fix GetCwdAllocError to include only the set of possible errors.
2025-11-19 04:10:11 -08:00
Matthew Lugg
806470b492
compiler: fix crash if file contents change during update
When reporting a compile error, we would load the new file, but assume
we could apply old AST/token indices (etc) to it, potentially causing
crashes. Instead, if the file stat has changed since it was loaded, just
emit an error that the file was modified mid-update.
2025-11-19 09:44:22 +00:00
Alex Rønne Petersen
abd05b3819
ci: enable s390x-linux jobs on Forgejo Actions 2025-11-19 09:41:58 +01:00
Alex Rønne Petersen
830831dcba
ci: add s390x-linux scripts 2025-11-19 09:41:55 +01:00
Ryan Liptak
fb1bd78908 process.getenvW: Document that returned memory points to the PEB 2025-11-18 20:07:39 -08:00
Alex Rønne Petersen
e11901c1a1
test: enable C ABI tests on s390x-linux 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
692c798303
test: disable a bunch of failing C ABI tests on s390x 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
ae0cc8c065
test: disable C ABI tests using i128 on s390x due to an LLVM crash
https://github.com/llvm/llvm-project/issues/168460
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
a8e77b7a05
test: disable test-link on big-endian hosts
https://github.com/ziglang/zig/issues/25961
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
a0e6d41331
test: remove complex arithmetic testing from c_compiler standalone test
This has no business being here. Tests for our compiler-rt routines should be in
compiler-rt, and tests for our C ABI compliance should be in `test-c-abi`.
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
41f7f3d4d5
test: disable an error trace test on optimized s390x-linux 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
0d104a645a
test: fix glibc_compat test for s390x which does not have local atexit 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
508f676bb4
std.os.linux.IoUring: disable bind/listen/connect on s390x
https://github.com/ziglang/zig/issues/25956
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
e179335bee
std.zon.parse: disable zon vector on s390x
https://github.com/ziglang/zig/issues/25957
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
c82884542f
langref: work around s390x LLVM compilation crash in test_defining_variadic_function
https://github.com/ziglang/zig/issues/21350#issuecomment-3543006475
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
4dd05b5a64
langref: work around s390x LLVM miscompilation in runtime_shrExact_overflow
https://github.com/ziglang/zig/issues/24304
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
04853f4052
std-docs: read/write messages as little endian 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
8817fc8958
incr-check: read/write messages as little endian 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
acfb88e9a5
std.Build.Step.CheckObject: make ELF reading endianness-aware 2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
e09ba67161
std.Build.Step.Run: read/write messages as little endian 2025-11-19 01:42:45 +01:00
Matthew Lugg
0922990367
std.Build.Step: send messages to compiler as little-endian
Little-endian is what `std.zig.Server` expects, but the old logic just
sent the raw bytes of the struct, i.e. in native endianness (causing a
crash on big-endian targets).
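
The difference between dumping a struct's raw bytes and encoding it explicitly can be sketched in Python. The two-field header layout here (a `u32` tag followed by a `u32` length) is an assumption for illustration, and the function names are hypothetical.

```python
import struct


def encode_header(tag, bytes_len):
    # "<" forces little-endian regardless of the host's byte order.
    return struct.pack("<II", tag, bytes_len)


def decode_header(buf):
    # Decode the same two fields explicitly instead of reinterpreting memory.
    tag, bytes_len = struct.unpack("<II", buf[:8])
    return tag, bytes_len
```

On a big-endian host, a native-order dump of the same fields would correspond to `struct.pack(">II", ...)`, which the little-endian reader on the other side would misinterpret.
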
2025-11-19 01:42:45 +01:00
Matthew Lugg
cd7d8dff26
std.zig.Server: read error bundle as little-endian
Again, `std.zig.Server` expects little-endian. This is easy; we just use
a `Reader.fixed` instead of directly `@ptrCast`ing data out of the
buffer.
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
989e05d93b
stage1: update zig1.wasm
Signed-off-by: Alex Rønne Petersen <alex@alexrp.com>
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
4b99e3718b
compiler: don't use self-hosted backends on big-endian hosts
https://github.com/ziglang/zig/issues/25961
2025-11-19 01:42:45 +01:00
Alex Rønne Petersen
959a3612c2
aro: define arch macros for s390x 2025-11-19 01:42:45 +01:00