Commit graph

91 commits

Author SHA1 Message Date
Ryan Liptak
822f412424 fs.path: Fix on big-endian architectures, make PathType.isSep assume WTF-16 is LE
This commit flips usage of PathType.isSep from requiring the caller to convert to native to assuming the input is LE encoded, which is a breaking change. This makes usage a bit nicer, though, and moves the endian conversion work from runtime to comptime.
2025-11-21 22:26:58 -08:00
Ryan Liptak
59b8bed222 Teach fs.path about the wonderful world of Windows paths
Previously, fs.path handled a few of the Windows path types, but not all of them, and only a few of them correctly/consistently. This commit aims to make `std.fs.path` correct and consistent in handling all possible Win32 path types.

This commit also slightly nudges the codebase towards a separation of Win32 paths and NT paths, as NT paths are not actually distinguishable from Win32 paths from looking at their contents alone (i.e. `\Device\Foo` could be an NT path or a Win32 rooted path, no way to tell without external context). This commit formalizes `std.fs.path` being fully concerned with Win32 paths, and having no special detection/handling of NT paths.

Resources on Windows path types, and Win32 vs NT paths:

- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html
- https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

API additions/changes/deprecations

- `std.os.windows.getWin32PathType` was added (it is analogous to `RtlDetermineDosPathNameType_U`), while `std.os.windows.getNamespacePrefix` and `std.os.windows.getUnprefixedPathType` were deleted. `getWin32PathType` forms the basis on which the updated `std.fs.path` functions operate.
- `std.fs.path.parsePath`, `std.fs.path.parsePathPosix`, and `std.fs.path.parsePathWindows` were added, while `std.fs.path.windowsParsePath` was deprecated. The new `parsePath` functions provide the "root" and the "kind" of a path, which is platform-specific. The now-deprecated `windowsParsePath` did not handle all possible path types, while the new `parsePathWindows` does.
- `std.fs.path.diskDesignator` has been deprecated in favor of `std.fs.path.parsePath`, and same deal with `diskDesignatorWindows` -> `parsePathWindows`
- `relativeWindows` is now a compile error when *not* targeting Windows, while `relativePosix` is now a compile error when targeting Windows. This is because those functions read/use the CWD path which will behave improperly when used from a system with different path semantics (e.g. calling `relativePosix` from a Windows system with a CWD like `C:\foo\bar` will give you a bogus result since that'd be treated as a single relative component when using POSIX semantics). This also allows `relativeWindows` to use Windows-specific APIs for getting the CWD and environment variables to cut down on allocations.
- `componentIterator`/`ComponentIterator.init` have been made infallible. These functions used to be able to error on UNC paths with an empty server component, and on paths that were assumed to be NT paths, but now:
  + We follow the lead of `RtlDetermineDosPathNameType_U`/`RtlGetFullPathName_U` in how it treats a UNC path with an empty server name (e.g. `\\\share`) and allow it, even if it'll be invalid at the time of usage
  + Now that `std.fs.path` assumes paths are Win32 paths and not NT paths, we don't have to worry about NT paths

Behavior changes

- `std.fs.path` generally: any combinations of mixed path separators for UNC paths are universally supported, e.g. `\/server/share`, `/\server\share`, `/\server/\\//share` are all seen as equivalent UNC paths
- `resolveWindows` handles all path types more appropriately/consistently.
  + `//` and `//foo` used to be treated as a relative path, but are now seen as UNC paths
  + If a rooted/drive-relative path cannot be resolved against anything more definite, the result will remain a rooted/drive-relative path.
  + I've created [a script to generate the results of a huge number of permutations of different path types](https://gist.github.com/squeek502/9eba7f19cad0d0d970ccafbc30f463bf) (the result of running the script is also included for anyone that'd like to vet the behavior).
- `dirnameWindows` now treats the drive-relative root as the dirname of a drive-relative path with a component, e.g. `dirname("C:foo")` is now `C:`, whereas before it would return null. `dirnameWindows` also handles local device paths appropriately now.
- `basenameWindows` now handles all path types more appropriately. The most notable change here is `//a` being treated as a partial UNC path now and therefore `basename` will return `""` for it, whereas before it would return `"a"`
- `relativeWindows` will now do its best to resolve against the most appropriate CWD for each path, e.g. relative for `D:foo` will look at the CWD to check if the drive letter matches, and if not, look at the special environment variable `=D:` to get the shell-defined CWD for that drive, and if that doesn't exist, then it'll resolve against `D:\`.

Implementation details

- `resolveWindows` previously looped through the paths twice to build up the relevant info before doing the actual resolution. Now, `resolveWindows` iterates backwards once and keeps track of which paths are actually relevant using a bit set, which also allows it to break from the loop when it's no longer possible for earlier paths to matter.
- A standalone test was added to test parts of `relativeWindows` since the CWD resolution logic depends on CWD information from the PEB and environment variables

Edge cases worth noting

- A strange piece of trivia that I found out while working on this is that it's technically possible to have a drive letter that it outside the intended A-Z range, or even outside the ASCII range entirely. Since we deal with both WTF-8 and WTF-16 paths, `path[0]`/`path[1]`/`path[2]` will not always refer to the same bits of information, so to get consistent behavior, some decision about how to deal with this edge case had to be made. I've made the choice to conform with how `RtlDetermineDosPathNameType_U` works, i.e. treat the first WTF-16 code unit as the drive letter. This means that when working with WTF-8, checking for drive-relative/drive-absolute paths is a bit more complicated. For more details, see the lengthy comment in `std.os.windows.getWin32PathType`
- `relativeWindows` will now almost always be able to return either a fully-qualified absolute path or a relative path, but there's one scenario where it may return a rooted path: when the CWD gotten from the PEB is not a drive-absolute or UNC path (if that's actually feasible/possible?). An alternative approach to this scenario might be to resolve against the `HOMEDRIVE` env var if available, and/or default to `C:\` as a last resort in order to guarantee the result of `relative` is never a rooted path.
- Partial UNC paths (e.g. `\\server` instead of `\\server\share`) are a bit awkward to handle, generally. Not entirely sure how best to handle them, so there may need to be another pass in the future to iron out any issues that arise. As of now the behavior is:
  + For `relative`, any part of a UNC disk designator is treated as the "root" and therefore isn't applicable for relative paths, e.g. calling `relative` with `\\server` and `\\server\share` will result in `\\server\share` rather than just `share` and if `relative` is called with `\\server\foo` and `\\server\bar` the result will be `\\server\bar` rather than `..\bar`
  + For `resolve`, any part of a UNC disk designator is also treated as the "root", but relative and rooted paths are still elligable for filling in missing portions of the disk designator, e.g. `resolve` with `\\server` and `foo` or `\foo` will result in `\\server\foo`

Fixes #25703
Closes #25702
2025-11-21 00:03:44 -08:00
Andrew Kelley
aa6e8eff40 std.Io.Threaded: implement dirAccess for Windows 2025-10-29 06:20:50 -07:00
Techatrix
bd1e960bc0 fix std.fs.path.resolveWindows on UNC paths with mixed path separators 2025-10-26 02:18:11 +02:00
usebeforefree
62e3d46287 replaced https://simonsapin.github.io/wtf-8/ with https://wtf-8.codeberg.page/ 2025-10-10 23:53:00 +02:00
Andrew Kelley
150169f1e0 std.fmt: delete deprecated APIs
std.fmt.Formatter -> std.fmt.Alt
std.fmt.format -> std.Io.Writer.print
2025-08-31 12:49:18 -07:00
Andrew Kelley
79f267f6b9 std.Io: delete GenericReader
and delete deprecated alias std.io
2025-08-29 17:14:26 -07:00
Andrew Kelley
749f10af49 std.ArrayList: make unmanaged the default 2025-08-11 15:52:49 -07:00
Andrew Kelley
5360968e03 std: rename io to Io in preparation
This commit is non-breaking.

std.io is deprecated in favor of std.Io, in preparation for that
namespace becoming an interface.
2025-07-11 01:16:27 +02:00
Andrew Kelley
0e37ff0d59 std.fmt: breaking API changes
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API

make std.testing.expectFmt work at compile-time

std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.

Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
  - anytype -> *std.io.Writer
  - inferred error set -> error{WriteFailed}
  - options -> (deleted)
* std.fmt.Formatted
  - now takes context type explicitly
  - no fmt string
2025-07-07 22:43:51 -07:00
Jacob Young
16d78bc0c0 Build: add install commands to --verbose output 2025-06-19 11:45:06 -04:00
mlugg
5202c977d9
std: add fs.path.fmtJoin
This allows joining paths without allocating using a `Writer`.
2025-01-25 04:48:00 +00:00
Andrew Kelley
0bacb79c09 Revert "Merge pull request #21540 from BratishkaErik/search-env-in-path"
It caused an assertion failure when building Zig from source.

This reverts commit 0595feb341, reversing
changes made to 744771d330.

closes #22566
closes #22568
2025-01-21 11:22:28 -08:00
Eric Joldasov
5bbf3f5561
std.fs.path.joinSepMaybeZ: replace while-loops with for-loops
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
2025-01-20 14:29:01 +05:00
Andrew Kelley
12191c8a22 std: promote tests to doctests
Now these show up as "example usage" in generated documentation.
2024-03-21 14:11:46 -07:00
Ryan Liptak
9fec608b3b Add std.fs.path.fmtAsUtf8Lossy/fmtWtf16LeAsUtf8Lossy 2024-02-24 14:05:24 -08:00
Ryan Liptak
68b87918df Fix handling of Windows (WTF-16) and WASI (UTF-8) paths
Windows paths now use WTF-16 <-> WTF-8 conversion everywhere, which is lossless. Previously, conversion of ill-formed UTF-16 paths would either fail or invoke illegal behavior.

WASI paths must be valid UTF-8, and the relevant function calls have been updated to handle the possibility of failure due to paths not being encoded/encodable as valid UTF-8.

Closes #18694
Closes #1774
Closes #2565
2024-02-24 14:05:24 -08:00
Andrew Kelley
3fc6fc6812 std.builtin.Endian: make the tags lower case
Let's take this breaking change opportunity to fix the style of this
enum.
2023-10-31 21:37:35 -04:00
Ryan Liptak
c139b9d4ad path.ComponentIterator: Add peekNext and peekPrevious functions 2023-10-18 20:42:59 -07:00
Andrew Kelley
efbfa8ef3e std.fs.path.resolve: add test cases for empty string
Since I'm about to rely on this behavior.
2023-10-08 19:18:21 -07:00
Ryan Liptak
49053cb1b4 Add fs.path.ComponentIterator and use it in Dir.makePath
Before this commit, there were three issues with the makePath implementation:

1. The component iteration did not 'collapse' consecutive path separators; instead, it would treat `a/b//c` as `a/b//c`, `a/b/`, `a/b`, and `a`.
2. Trailing path separators led to redundant `makeDir` calls, e.g. with the path `a/b/` (if `a` doesn't exist), it would try to create `a/b/`, then try `a/b`, then try `a`, then try `a/b`, and finally try `a/b/` again.
3. The iteration did not treat the root of a path specially, so on Windows it could attempt to make a directory with a path like `X:` for something like `X:\a\b\c` if the `X:\` drive doesn't exist. This didn't lead to any problems that I could find, but there's no reason to try to make a drive letter as a directory (or any other root path).

This commit fixes all three issues by introducing a ComponentIterator that is root-aware and handles both sequential path separators and trailing path separators and uses it in `Dir.makePath`. This reduces the number of `makeDir` calls for paths where (1) the root of the path doesn't exist, (2) there are consecutive path separators, or (3) there are trailing path separators

As an example, here are the makeDir calls that would have been made before this commit when calling `makePath` for a relative path like `a/b//c//` (where the full path needs to be created):

a/b//c//
a/b//c/
a/b//c
a/b/
a/b
a
a/b
a/b/
a/b//c
a/b//c/
a/b//c//

And after this commit:

a/b//c
a/b
a
a/b
a/b//c
2023-07-27 10:22:54 -07:00
Ryan Liptak
2eae013378 fs.path: Fix Windows path component comparison being ASCII-only
We can use eqlIgnoreCaseUtf8 to get Unicode-aware Windows-compliant case insensitive path component comparison
2023-06-30 15:29:43 -07:00
Eric Joldasov
50339f595a all: zig fmt and rename "@XToY" to "@YFromX"
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
2023-06-19 12:34:42 -07:00
Ryan Liptak
815e53b147 Update all std.mem.tokenize calls to their appropriate function
Everywhere that can now use `tokenizeScalar` should get a nice little performance boost.
2023-05-13 13:45:04 -07:00
Linus Groh
94e30a756e std: fix a bunch of typos
The majority of these are in comments, some in doc comments which might
affect the generated documentation, and a few in parameter names -
nothing that should be breaking, however.
2023-04-30 18:16:04 -07:00
Andrew Kelley
6261c13731 update codebase to use @memset and @memcpy 2023-04-28 13:24:43 -07:00
zooster
bc8e1e1de4
Improvements to docs and text
* docs(std.math): elaborate on difference between absCast and absInt

* docs(std.rand.Random.weightedIndex): elaborate on likelihood

I think this makes it easier to understand.

* langref: add small reminder

* docs(std.fs.path.extension): brevity

* docs(std.bit_set.StaticBitSet): mention the specific types

* std.debug.TTY: explain what purpose this struct serves

This should also make it clearer that this struct is not supposed to provide unrelated terminal manipulation functionality such as setting the cursor position or something because terminals are complicated and we should keep this struct simple and focused on debugging.

* langref(package listing): brevity

* langref: explain what exactly `threadlocal` causes to happen

* std.array_list: link between swapRemove and orderedRemove

Maybe this can serve as a TLDR and make it easier to decide.

* PrefetchOptions.locality: clarify docs that this is a range

This confused me previously and I thought I can only use either 0 or 3.

* fix typos and more

* std.builtin.CallingConvention: document some CCs

* langref: explain possibly cryptic names

I think it helps knowing what exactly these acronyms (@clz and @ctz) and
abbreviations (@popCount) mean.

* variadic function error: add missing preposition

* std.fmt.format docs: nicely hyphenate

* help menu: say what to optimize for

I think this is slightly more specific than just calling it
"optimizations". These are speed optimizations. I used the word
"performance" here.
2023-04-23 21:06:21 +03:00
Jacob Young
ad5fb4879b std: fix memory bugs
This fixes logged errors during CI based on the new GPA checks.
2023-04-05 08:23:07 +02:00
Andrew Kelley
aeaef8c0ff update std lib and compiler sources to new for loop syntax 2023-02-18 19:17:21 -07:00
pluick
2d617c482c
Fix cache-dir specified on the command line (#14076)
The resolvePosix and resolveWindows routines changed behaviour in an
earlier commit so that the return value is not always an absolute path.
That caused the relativePosix and relativeWindows to return a relative
path that is not correct.

The change in behaviour mentioned above would cause a local cache-dir to
be created in the wrong directory when --cache-dir was specified for a
build.
2023-01-05 01:37:00 -08:00
Sizhe Zhao
4fa027a3cc docs: Clarify that std.fs.path.resolve doesn't resolve relative path to
absolute path
2023-01-03 12:49:05 +02:00
Andrew Kelley
ceb0a632cf std.mem.Allocator: allow shrink to fail
closes #13535
2022-11-29 23:30:38 -07:00
Andrew Kelley
3ae4931dc1 CLI: more careful resolution of paths
In general, we prefer compiler code to use relative paths based on open
directory handles because this is the most portable. However, sometimes
absolute paths are used, and sometimes relative paths are used that go
up a directory.

The recent improvements in 81d2135ca6
regressed the use case when an absolute path is used for the zig lib
directory mixed with a relative path used for the root source file. This
could happen when, for example, running the standard library tests, like
this:

stage3/bin/zig test ../lib/std/std.zig

This happened because the zig lib dir was inferred to be an absolute
directory based on the zig executable directory, while the root source
file was detected as a relative path. There was no common prefix and so
it was not determined that the std.zig file was inside the lib
directory.

This commit adds a function for resolving paths that preserves relative
path names while allowing absolute paths, and converting relative
upwards paths (e.g. "../foo") to absolute paths. This restores the
previous functionality while remaining compatible with systems such as
WASI that cannot deal with absolute paths.
2022-11-28 01:23:39 -05:00
Andrew Kelley
d24aaf8847 std.fs.path.resolve: eliminate getcwd() syscall
This is a breaking change to the API. Instead of the first path
implicitly being the current working directory, it now asserts that the
number of paths passed is greater than zero.

Importantly, it never calls getcwd(); instead, it can possibly return
".", or a series of "../". This changes the error set to only be
`error{OutOfMemory}`.

closes #13613
2022-11-22 20:57:56 -07:00
r00ster91
7721c0cbef std.fs.path: add stem() 2022-10-24 18:06:40 +02:00
r00ster91
a0a50955f0 docs(std.fs.path.extension): correct arrow alignment 2022-10-24 18:06:40 +02:00
noiryuh
0be46866fe use std.ascii instead of defining ascii functions in std.fs.path 2022-09-23 12:19:09 +03:00
Evin Yulo
dab5bb9247 Fix docstring for std.fs.path.extension 2022-09-22 20:13:09 -04:00
Veikka Tuominen
b55a5007fa Sema: fix parameter of type 'T' must be comptime error
Closes #12519
Closes #12505
2022-08-22 11:16:36 +03:00
Cody Tapscott
7b090df668 stdlib std.os: Improve wasi-libc parity for WASI CWD emulation
Two major changes here:
  1. We store the CWD as a simple `[]const u8` and lookup Preopens for
     every absolute or CWD-referenced file operation, based on the
     Preopen with the longest match (i.e. most specific path)
  2. Preorders are normalized to POSIX absolute paths at init time.
     Behavior depends on the "cwd_root" parameter of `initPreopensWasi`:

	`cwd_root` is used for any Preopens that start with "."

	  For example:
            "./foo/bar" - inits to -> "{cwd_root}/foo/bar"
            "foo/bar"   - inits to -> "/foo/bar"
	    "/foo/bar"  - inits to -> "/foo/bar"

        `cwd_root` must be an absolute path.

	Using "/" as `cwd_root` gives behavior similar to wasi-libc.
2022-04-16 18:08:05 +02:00
Cody Tapscott
ade2d0c6a2 stdlib WASI: Add realpath() support for non-absolute Preopens 2022-03-03 14:31:49 -07:00
Cody Tapscott
58f961f4cb stdlib: Add emulated CWD to std.os for WASI targets
This adds a special CWD file descriptor, AT.FDCWD (-2), to refer to the
current working directory. The `*at(...)` functions look for this and
resolve relative paths against the stored CWD. Absolute paths are
dynamically matched against the stored Preopens.

"os.initPreopensWasi()" must be called before std.os functions will
resolve relative or absolute paths correctly. This is asserted at
runtime.

Support has been added for: `open`, `rename`, `mkdir`, `rmdir`, `chdir`,
`fchdir`, `link`, `symlink`, `unlink`, `readlink`, `fstatat`, `access`,
and `faccessat`.

This also includes limited support for `getcwd()` and `realpath()`.
These return an error if the CWD does not correspond to a Preopen with
an absolute path. They also do not currently expand symlinks.
2022-03-03 14:31:49 -07:00
Andrew Kelley
3d89ff5130 std.fs.path: revert recent public API change
41fd343508 made a breaking change to the
public API; this commit reverts the API changes but keeps the
improved logic.
2022-01-11 11:00:19 -07:00
fifty-six
41fd343508 std: fix path joining on UEFI
UEFI uses `\` for paths exclusively. This changes std.fs.path to use `\`
for UEFI path joining. Also adds a few tests regarding it, specifically
in making sure double-separators do not result from path joining, as the
UEFI spec says to convert any that result from joining into single
separators (UEFI Spec Version 2.7, pg. 448).
2022-01-11 10:49:40 -07:00
Lee Cannon
85de022c56
allocgate: std Allocator interface refactor 2021-11-30 23:32:47 +00:00
Andrew Kelley
902df103c6 std lib API deprecations for the upcoming 0.9.0 release
See #3811
2021-11-30 00:13:07 -07:00
LemonBoy
2551946a51 std: Fix path resolution on Windows
GetCurrentDirectory returns a path with a trailing slash iff the cwd is
a root directory, making the code in `resolveWindows` return an invalid
path with two consecutive slashes.

Closes #10093
2021-11-04 21:05:14 -04:00
Andrew Kelley
d29871977f remove redundant license headers from zig standard library
We already have a LICENSE file that covers the Zig Standard Library. We
no longer need to remind everyone that the license is MIT in every single
file.

Previously this was introduced to clarify the situation for a fork of
Zig that made Zig's LICENSE file harder to find, and replaced it with
their own license that required annual payments to their company.
However that fork now appears to be dead. So there is no need to
reinforce the copyright notice in every single file.
2021-08-24 12:25:09 -07:00
Ryan Liptak
d31352ee85 Update all usages of mem.split/mem.tokenize for generic version 2021-08-06 02:01:47 -07:00
Austin Clements
eb010ce65d Skip empty strings in std.fs.path.join function 2021-07-28 09:51:48 +03:00