zig/lib/std
Ryan Liptak 59b8bed222 Teach fs.path about the wonderful world of Windows paths
Previously, fs.path handled a few of the Windows path types, but not all of them, and only a few of them correctly/consistently. This commit aims to make `std.fs.path` correct and consistent in handling all possible Win32 path types.

This commit also slightly nudges the codebase towards a separation of Win32 paths and NT paths, as NT paths are not actually distinguishable from Win32 paths from looking at their contents alone (i.e. `\Device\Foo` could be an NT path or a Win32 rooted path, no way to tell without external context). This commit formalizes `std.fs.path` being fully concerned with Win32 paths, and having no special detection/handling of NT paths.

Resources on Windows path types, and Win32 vs NT paths:

- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html
- https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

API additions/changes/deprecations

- `std.os.windows.getWin32PathType` was added (it is analogous to `RtlDetermineDosPathNameType_U`), while `std.os.windows.getNamespacePrefix` and `std.os.windows.getUnprefixedPathType` were deleted. `getWin32PathType` forms the basis on which the updated `std.fs.path` functions operate.
- `std.fs.path.parsePath`, `std.fs.path.parsePathPosix`, and `std.fs.path.parsePathWindows` were added, while `std.fs.path.windowsParsePath` was deprecated. The new `parsePath` functions provide the "root" and the "kind" of a path, which is platform-specific. The now-deprecated `windowsParsePath` did not handle all possible path types, while the new `parsePathWindows` does.
- `std.fs.path.diskDesignator` has been deprecated in favor of `std.fs.path.parsePath`, and likewise `diskDesignatorWindows` has been deprecated in favor of `parsePathWindows`
- `relativeWindows` is now a compile error when *not* targeting Windows, while `relativePosix` is now a compile error when targeting Windows. This is because those functions read/use the CWD path which will behave improperly when used from a system with different path semantics (e.g. calling `relativePosix` from a Windows system with a CWD like `C:\foo\bar` will give you a bogus result since that'd be treated as a single relative component when using POSIX semantics). This also allows `relativeWindows` to use Windows-specific APIs for getting the CWD and environment variables to cut down on allocations.
- `componentIterator`/`ComponentIterator.init` have been made infallible. These functions used to be able to error on UNC paths with an empty server component, and on paths that were assumed to be NT paths, but now:
  + We follow the lead of `RtlDetermineDosPathNameType_U`/`RtlGetFullPathName_U` in how they treat a UNC path with an empty server name (e.g. `\\\share`) and allow it, even if it'll be invalid at the time of usage
  + Now that `std.fs.path` assumes paths are Win32 paths and not NT paths, we don't have to worry about NT paths
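
As a rough illustration of the kind of classification described above, here is a minimal sketch modeled on the publicly documented behavior of `RtlDetermineDosPathNameType_U`. It is not the code from `std.os.windows.getWin32PathType`: the names `PathType`, `isSep`, `at`, and `classify` are made up for this example, and it only looks at single bytes, so it ignores the non-ASCII drive letter case discussed further down.

```zig
const std = @import("std");

/// Hypothetical names for this sketch; the real enum in `std.os.windows` may differ.
const PathType = enum {
    unc_absolute, // \\server\share
    drive_absolute, // C:\foo
    drive_relative, // C:foo
    rooted, // \foo
    relative, // foo
    local_device, // \\.\foo or \\?\foo
    root_local_device, // exactly \\. or \\?
};

fn isSep(c: u8) bool {
    return c == '/' or c == '\\';
}

/// Returns the byte at `i`, or 0 when `i` is past the end of the slice
/// (this mimics reading the NUL terminator of a C string).
fn at(path: []const u8, i: usize) u8 {
    return if (i < path.len) path[i] else 0;
}

fn classify(path: []const u8) PathType {
    if (isSep(at(path, 0))) {
        // One leading separator: rooted. Two: UNC or a device path.
        if (!isSep(at(path, 1))) return .rooted;
        const c = at(path, 2);
        if (c != '.' and c != '?') return .unc_absolute;
        if (isSep(at(path, 3))) return .local_device;
        if (at(path, 3) != 0) return .unc_absolute; // e.g. \\.foo
        return .root_local_device;
    }
    if (path.len < 2 or path[1] != ':') return .relative;
    if (isSep(at(path, 2))) return .drive_absolute;
    return .drive_relative;
}

test "classify Win32 path types" {
    const expectEqual = std.testing.expectEqual;
    try expectEqual(PathType.drive_absolute, classify("C:\\foo"));
    try expectEqual(PathType.drive_relative, classify("C:foo"));
    try expectEqual(PathType.rooted, classify("\\foo"));
    try expectEqual(PathType.relative, classify("foo"));
    try expectEqual(PathType.unc_absolute, classify("//server/share"));
    try expectEqual(PathType.local_device, classify("\\\\.\\pipe"));
    try expectEqual(PathType.root_local_device, classify("\\\\?"));
}
```

Note that nothing here can tell `\Device\Foo` apart from any other rooted path, which is the reason the commit treats NT paths as out of scope for `std.fs.path`.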

Behavior changes

- `std.fs.path` generally: any combinations of mixed path separators for UNC paths are universally supported, e.g. `\/server/share`, `/\server\share`, `/\server/\\//share` are all seen as equivalent UNC paths
- `resolveWindows` handles all path types more appropriately/consistently.
  + `//` and `//foo` used to be treated as a relative path, but are now seen as UNC paths
  + If a rooted/drive-relative path cannot be resolved against anything more definite, the result will remain a rooted/drive-relative path.
  + I've created [a script to generate the results of a huge number of permutations of different path types](https://gist.github.com/squeek502/9eba7f19cad0d0d970ccafbc30f463bf) (the result of running the script is also included for anyone that'd like to vet the behavior).
- `dirnameWindows` now treats the drive-relative root as the dirname of a drive-relative path with a component, e.g. `dirname("C:foo")` is now `C:`, whereas before it would return null. `dirnameWindows` also handles local device paths appropriately now.
- `basenameWindows` now handles all path types more appropriately. The most notable change here is `//a` being treated as a partial UNC path now and therefore `basename` will return `""` for it, whereas before it would return `"a"`
- `relativeWindows` will now do its best to resolve against the most appropriate CWD for each path, e.g. relative for `D:foo` will look at the CWD to check if the drive letter matches, and if not, look at the special environment variable `=D:` to get the shell-defined CWD for that drive, and if that doesn't exist, then it'll resolve against `D:\`.
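
The fallback order described for drive-relative paths can be sketched as follows. This is only an illustration under stated assumptions: `baseForDrive` and its parameters are hypothetical, the CWD and the value of the `=D:` environment variable are passed in as plain slices rather than read from the PEB, and comparing drive letters case-insensitively is an assumption of this sketch rather than something the commit message states.

```zig
const std = @import("std");

/// Picks the directory that a drive-relative path such as `D:foo` is resolved against.
/// `cwd` stands in for the process CWD and `per_drive_cwd` for the value of the
/// `=D:` environment variable (null when unset); both are hypothetical parameters.
fn baseForDrive(
    drive: u8,
    cwd: []const u8,
    per_drive_cwd: ?[]const u8,
    root_buf: *[3]u8,
) []const u8 {
    // 1. The process CWD wins when it is on the same drive.
    if (cwd.len >= 2 and cwd[1] == ':' and
        std.ascii.toUpper(cwd[0]) == std.ascii.toUpper(drive))
    {
        return cwd;
    }
    // 2. Otherwise use the shell-defined CWD for that drive, if there is one.
    if (per_drive_cwd) |dir| return dir;
    // 3. Last resort: the root of the drive, written into the caller's buffer.
    root_buf.* = .{ drive, ':', '\\' };
    return root_buf;
}

test "base directory for D:foo" {
    var buf: [3]u8 = undefined;
    const expectEqualStrings = std.testing.expectEqualStrings;
    try expectEqualStrings("D:\\work", baseForDrive('D', "D:\\work", null, &buf));
    try expectEqualStrings("D:\\proj", baseForDrive('D', "C:\\Users", "D:\\proj", &buf));
    try expectEqualStrings("D:\\", baseForDrive('D', "C:\\Users", null, &buf));
}
```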

Implementation details

- `resolveWindows` previously looped through the paths twice to build up the relevant info before doing the actual resolution. Now, `resolveWindows` iterates backwards once and keeps track of which paths are actually relevant using a bit set, which also allows it to break from the loop when it's no longer possible for earlier paths to matter.
- A standalone test was added to test parts of `relativeWindows` since the CWD resolution logic depends on CWD information from the PEB and environment variables
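
To make the single backwards pass concrete, here is a deliberately simplified sketch. It is not the `resolveWindows` implementation: `isFullyQualified` and `relevantPaths` are hypothetical helpers, a plain `u64` mask stands in for the bit set, and only `X:\...` and `\\...` paths are treated as fully qualified, whereas the real function has to account for every Win32 path type.

```zig
const std = @import("std");

fn isSep(c: u8) bool {
    return c == '/' or c == '\\';
}

/// Simplification for this sketch: only drive-absolute (`X:\...`) and
/// double-separator (`\\...`) paths count as fully qualified.
fn isFullyQualified(p: []const u8) bool {
    if (p.len >= 3 and p[1] == ':' and isSep(p[2])) return true;
    return p.len >= 2 and isSep(p[0]) and isSep(p[1]);
}

/// Returns a mask in which bit `i` is set when `paths[i]` can affect the result.
fn relevantPaths(paths: []const []const u8) u64 {
    std.debug.assert(paths.len <= 64);
    var mask: u64 = 0;
    var i = paths.len;
    while (i > 0) {
        i -= 1;
        mask |= @as(u64, 1) << @intCast(i);
        // Nothing before a fully qualified path can matter, so stop early.
        if (isFullyQualified(paths[i])) break;
    }
    return mask;
}

test "only paths from the last fully qualified one onward matter" {
    const paths = [_][]const u8{ "C:\\a", "b", "D:\\c", "d" };
    try std.testing.expectEqual(@as(u64, 0b1100), relevantPaths(&paths));
}
```

Walking from the end means the loop can exit as soon as it finds a path that overrides everything before it, instead of scanning all inputs up front.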

Edge cases worth noting

- A strange piece of trivia that I found out while working on this is that it's technically possible to have a drive letter that is outside the intended A-Z range, or even outside the ASCII range entirely. Since we deal with both WTF-8 and WTF-16 paths, `path[0]`/`path[1]`/`path[2]` will not always refer to the same bits of information, so to get consistent behavior, some decision about how to deal with this edge case had to be made. I've made the choice to conform with how `RtlDetermineDosPathNameType_U` works, i.e. treat the first WTF-16 code unit as the drive letter. This means that when working with WTF-8, checking for drive-relative/drive-absolute paths is a bit more complicated. For more details, see the lengthy comment in `std.os.windows.getWin32PathType`.
- `relativeWindows` will now almost always be able to return either a fully-qualified absolute path or a relative path, but there's one scenario where it may return a rooted path: when the CWD gotten from the PEB is not a drive-absolute or UNC path (if that's actually feasible/possible?). An alternative approach to this scenario might be to resolve against the `HOMEDRIVE` env var if available, and/or default to `C:\` as a last resort in order to guarantee the result of `relative` is never a rooted path.
- Partial UNC paths (e.g. `\\server` instead of `\\server\share`) are a bit awkward to handle, generally. Not entirely sure how best to handle them, so there may need to be another pass in the future to iron out any issues that arise. As of now the behavior is:
  + For `relative`, any part of a UNC disk designator is treated as the "root" and therefore isn't applicable for relative paths, e.g. calling `relative` with `\\server` and `\\server\share` will result in `\\server\share` rather than just `share`, and calling `relative` with `\\server\foo` and `\\server\bar` will result in `\\server\bar` rather than `..\bar`
  + For `resolve`, any part of a UNC disk designator is also treated as the "root", but relative and rooted paths are still eligible for filling in missing portions of the disk designator, e.g. `resolve` with `\\server` and `foo` or `\foo` will result in `\\server\foo`
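
The drive letter edge case above is easier to see with a small example. The sketch below is one possible way to apply the "first WTF-16 code unit" rule to a WTF-8 byte slice; it is not the logic from `std.os.windows.getWin32PathType`, the name `driveLetterLen` is hypothetical, and it assumes the input is well-formed WTF-8. A single WTF-16 code unit corresponds to a code point that takes one to three bytes in WTF-8, while a four-byte sequence would be a surrogate pair in WTF-16 and so cannot be followed by `:` at index 1.

```zig
const std = @import("std");

/// If `path` starts with a "drive letter" followed by ':', returns the number of
/// bytes the drive letter occupies in WTF-8 (1 to 3). Returns null otherwise.
/// Assumes `path` is well-formed WTF-8.
fn driveLetterLen(path: []const u8) ?usize {
    if (path.len == 0) return null;
    const first = path[0];
    const len: usize = if (first < 0x80)
        1 // ASCII
    else if ((first & 0xE0) == 0xC0)
        2 // U+0080..U+07FF
    else if ((first & 0xF0) == 0xE0)
        3 // U+0800..U+FFFF, including unpaired surrogates
    else
        return null; // 4-byte sequence (two WTF-16 code units) or a stray continuation byte
    if (path.len <= len or path[len] != ':') return null;
    return len;
}

test "drive letter may span several WTF-8 bytes" {
    const expectEqual = std.testing.expectEqual;
    try expectEqual(@as(?usize, 1), driveLetterLen("C:\\foo"));
    try expectEqual(@as(?usize, 3), driveLetterLen("\u{20AC}:\\foo"));
    try expectEqual(@as(?usize, null), driveLetterLen("\u{1F600}:\\foo"));
    try expectEqual(@as(?usize, null), driveLetterLen("foo"));
}
```

With WTF-16 input the same check reduces to testing whether the second code unit is `:`, which is why the two encodings cannot share fixed byte indices.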

Fixes #25703
Closes #25702
2025-11-21 00:03:44 -08:00
..
Build Teach fs.path about the wonderful world of Windows paths 2025-11-21 00:03:44 -08:00
builtin std.builtin.assembly: add Clobbers for kvx 2025-11-10 09:40:42 +01:00
c remove all Oracle Solaris support 2025-10-27 07:35:38 -07:00
compress add deflate compression, simplify decompression 2025-09-30 18:28:47 -07:00
crypto represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
debug represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
dwarf fix compiler ftbfs from std.macho and std.dwarf changes 2025-09-30 13:44:51 +01:00
fmt
fs Teach fs.path about the wonderful world of Windows paths 2025-11-21 00:03:44 -08:00
hash Remove usages of deprecatedWriter 2025-09-18 22:39:33 -07:00
heap MemoryPool: add unmanaged variants and make them the default 2025-11-15 09:30:57 +00:00
http Revert "std.http: disable failing test on 32-bit arm" 2025-11-01 11:21:28 -04:00
Io Teach fs.path about the wonderful world of Windows paths 2025-11-21 00:03:44 -08:00
json std.debug.lockStderrWriter: also return ttyconf 2025-10-30 09:31:28 +00:00
math represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
mem std.mem.Allocator: fix resize doc comment 2025-10-22 11:41:16 +02:00
meta
os Teach fs.path about the wonderful world of Windows paths 2025-11-21 00:03:44 -08:00
posix represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
process windows.eqlIgnoreCaseWTF16 -> eqlIgnoreCaseWtf16 2025-11-16 04:03:52 -08:00
Random Remove usages of deprecatedWriter 2025-09-18 22:39:33 -07:00
sort std.sort.pdq: fix out-of-bounds access in partialInsertionSort (#25253) 2025-09-17 19:54:15 -07:00
tar compiler: update for introduction of std.Io 2025-10-29 06:20:49 -07:00
Target std.Target.x86: refresh from update_cpu_features.zig 2025-11-13 22:29:28 +01:00
testing fix compile errors and minor bugs 2025-09-30 13:44:54 +01:00
Thread std.os.windows: eliminate forwarder function in kernel32 (#25766) 2025-10-31 13:54:50 +00:00
time
tz
unicode Remove usages of deprecatedWriter 2025-09-18 22:39:33 -07:00
valgrind
zig Teach fs.path about the wonderful world of Windows paths 2025-11-21 00:03:44 -08:00
zon coerce vectors to arrays rather than inline for 2025-09-20 18:33:00 -07:00
array_hash_map.zig Coff2: create a new linker from scratch 2025-10-02 17:44:52 -04:00
array_list.zig std.ArrayList: actaully memset to undefined in shrinkRetainingCapacity and clearRetainingCapacity 2025-11-06 05:30:41 -08:00
ascii.zig std: Skip element comparisons if mem.order args point to same memory 2025-10-31 18:34:33 -07:00
atomic.zig std.atomic: define cache line size for alpha, hppa, microblaze, sh 2025-10-23 09:27:17 +02:00
base64.zig Base64DecoderWithIgnore.calcSizeUpperBound cannot return an error (#25834) 2025-11-07 08:16:34 +01:00
bit_set.zig
BitStack.zig std.ArrayList: make unmanaged the default 2025-08-11 15:52:49 -07:00
buf_map.zig
buf_set.zig
Build.zig std.debug.lockStderrWriter: also return ttyconf 2025-10-30 09:31:28 +00:00
builtin.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
c.zig std.c: implement rusage for freebsd 2025-11-16 06:15:54 +01:00
coff.zig Coff: implement threadlocal variables 2025-10-10 22:47:47 -07:00
compress.zig std.compress: rework flate to new I/O API 2025-07-31 22:10:11 -07:00
crypto.zig Add ML-DSA post-quantum signatures (#25862) 2025-11-10 14:11:30 +01:00
debug.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
deque.zig std: remove loop from growCapacity 2025-09-20 14:34:18 -07:00
DoublyLinkedList.zig *LinkedList.remove() assumes node is in the list 2025-10-25 21:10:02 -07:00
dwarf.zig
dynamic_library.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
elf.zig posix: reduce the number of assumptions made by dl_iterate_phdr 2025-11-09 03:31:26 -05:00
enums.zig std.enums: fix EnumIndexer branch quota 2025-07-31 22:10:22 +01:00
fmt.zig test: add test case for enum-literal with '{t}' format 2025-11-06 13:45:21 +08:00
fs.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
gpu.zig
hash.zig simplify std.hash.Adler32 2025-07-31 22:10:11 -07:00
hash_map.zig std.hash_map: tune slow unit tests 2025-10-29 06:20:52 -07:00
heap.zig MemoryPool: add unmanaged variants and make them the default 2025-11-15 09:30:57 +00:00
http.zig http.BodyWriter: improve clarity of chunked state machine 2025-08-17 14:42:57 +02:00
Io.zig std/Io.zig Timestamp: add toMilliseconds() 2025-11-15 16:38:33 +09:00
json.zig std.Io: delete GenericReader 2025-08-29 17:14:26 -07:00
leb128.zig std.Io: delete GenericReader 2025-08-29 17:14:26 -07:00
log.zig std.log: colorize output in default implementation 2025-10-30 09:31:30 +00:00
macho.zig std: fixes 2025-09-30 13:44:51 +01:00
math.zig std: skip some failing tests on hexagon 2025-08-30 06:36:41 +02:00
mem.zig std: Skip element comparisons if mem.order args point to same memory 2025-10-31 18:34:33 -07:00
meta.zig std: disable a few failing tests on hexagon 2025-10-16 22:11:51 +02:00
multi_array_list.zig std: remove loop from growCapacity 2025-09-20 14:34:18 -07:00
once.zig
os.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
pdb.zig std.Io: delete GenericReader 2025-08-29 17:14:26 -07:00
pie.zig std.pie: add missing clobbers on alpha and sparc 2025-11-14 12:19:38 +01:00
posix.zig Merge pull request #25539 from squeek502/windows-readlinkw 2025-11-15 23:36:34 -08:00
priority_dequeue.zig std.ArrayList: make unmanaged the default 2025-08-11 15:52:49 -07:00
priority_queue.zig
process.zig std.process: Actually use explicit GetCwdError/GetCwdAllocError sets 2025-11-19 04:10:11 -08:00
Progress.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
Random.zig std.Io.net: progress towards DNS resolution 2025-10-29 06:20:48 -07:00
SemanticVersion.zig std.Io: delete GenericReader 2025-08-29 17:14:26 -07:00
simd.zig std.simd: suggest 1024-bit vectors for kvx 2025-11-10 09:40:44 +01:00
SinglyLinkedList.zig SinglyLinkedList.remove docs: Assumes -> asserts 2025-10-25 21:28:54 -07:00
sort.zig
start.zig std.start: add kvx support 2025-11-10 09:40:44 +01:00
static_string_map.zig
std.zig std.Io.Threaded: install and cleanup signal handlers 2025-10-29 06:20:52 -07:00
tar.zig std.Io: add dirOpenDir and WASI impl 2025-10-29 06:20:50 -07:00
Target.zig Merge pull request #25917 from alexrp/target-features 2025-11-14 12:23:09 +01:00
testing.zig std.debug.lockStderrWriter: also return ttyconf 2025-10-30 09:31:28 +00:00
Thread.zig std.Thread: disable thread local storage test on 32-bit targets 2025-11-16 00:08:20 +01:00
time.zig represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi 2025-11-14 11:33:35 +01:00
treap.zig std.ArrayList: make unmanaged the default 2025-08-11 15:52:49 -07:00
tz.zig std.tz: fix redundant endian handling 2025-08-28 18:30:57 -07:00
unicode.zig std: move some windows path checking logic 2025-10-29 06:20:50 -07:00
Uri.zig compiler: update for introduction of std.Io 2025-10-29 06:20:49 -07:00
valgrind.zig
wasm.zig
zig.zig Move/coalesce RcIncludes enum to std.zig.RcIncludes 2025-11-07 19:16:52 -08:00
zip.zig std.Io: delete GenericReader 2025-08-29 17:14:26 -07:00
zon.zig zon: Add anonymous struct literal in the example 2025-08-15 23:35:16 +02:00