Commit graph

41 commits

Author SHA1 Message Date
Ryan Liptak
59b8bed222 Teach fs.path about the wonderful world of Windows paths
Previously, fs.path handled a few of the Windows path types, but not all of them, and only a few of them correctly/consistently. This commit aims to make `std.fs.path` correct and consistent in handling all possible Win32 path types.

This commit also slightly nudges the codebase towards a separation of Win32 paths and NT paths, as NT paths are not actually distinguishable from Win32 paths from looking at their contents alone (i.e. `\Device\Foo` could be an NT path or a Win32 rooted path, no way to tell without external context). This commit formalizes `std.fs.path` being fully concerned with Win32 paths, and having no special detection/handling of NT paths.

Resources on Windows path types, and Win32 vs NT paths:

- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html
- https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

API additions/changes/deprecations

- `std.os.windows.getWin32PathType` was added (it is analogous to `RtlDetermineDosPathNameType_U`), while `std.os.windows.getNamespacePrefix` and `std.os.windows.getUnprefixedPathType` were deleted. `getWin32PathType` forms the basis on which the updated `std.fs.path` functions operate.
- `std.fs.path.parsePath`, `std.fs.path.parsePathPosix`, and `std.fs.path.parsePathWindows` were added, while `std.fs.path.windowsParsePath` was deprecated. The new `parsePath` functions provide the "root" and the "kind" of a path, which is platform-specific. The now-deprecated `windowsParsePath` did not handle all possible path types, while the new `parsePathWindows` does.
- `std.fs.path.diskDesignator` has been deprecated in favor of `std.fs.path.parsePath`, and same deal with `diskDesignatorWindows` -> `parsePathWindows`
- `relativeWindows` is now a compile error when *not* targeting Windows, while `relativePosix` is now a compile error when targeting Windows. This is because those functions read/use the CWD path which will behave improperly when used from a system with different path semantics (e.g. calling `relativePosix` from a Windows system with a CWD like `C:\foo\bar` will give you a bogus result since that'd be treated as a single relative component when using POSIX semantics). This also allows `relativeWindows` to use Windows-specific APIs for getting the CWD and environment variables to cut down on allocations.
- `componentIterator`/`ComponentIterator.init` have been made infallible. These functions used to be able to error on UNC paths with an empty server component, and on paths that were assumed to be NT paths, but now:
  + We follow the lead of `RtlDetermineDosPathNameType_U`/`RtlGetFullPathName_U` in how it treats a UNC path with an empty server name (e.g. `\\\share`) and allow it, even if it'll be invalid at the time of usage
  + Now that `std.fs.path` assumes paths are Win32 paths and not NT paths, we don't have to worry about NT paths

Behavior changes

- `std.fs.path` generally: any combinations of mixed path separators for UNC paths are universally supported, e.g. `\/server/share`, `/\server\share`, `/\server/\\//share` are all seen as equivalent UNC paths
- `resolveWindows` handles all path types more appropriately/consistently.
  + `//` and `//foo` used to be treated as a relative path, but are now seen as UNC paths
  + If a rooted/drive-relative path cannot be resolved against anything more definite, the result will remain a rooted/drive-relative path.
  + I've created [a script to generate the results of a huge number of permutations of different path types](https://gist.github.com/squeek502/9eba7f19cad0d0d970ccafbc30f463bf) (the result of running the script is also included for anyone that'd like to vet the behavior).
- `dirnameWindows` now treats the drive-relative root as the dirname of a drive-relative path with a component, e.g. `dirname("C:foo")` is now `C:`, whereas before it would return null. `dirnameWindows` also handles local device paths appropriately now.
- `basenameWindows` now handles all path types more appropriately. The most notable change here is `//a` being treated as a partial UNC path now and therefore `basename` will return `""` for it, whereas before it would return `"a"`
- `relativeWindows` will now do its best to resolve against the most appropriate CWD for each path, e.g. relative for `D:foo` will look at the CWD to check if the drive letter matches, and if not, look at the special environment variable `=D:` to get the shell-defined CWD for that drive, and if that doesn't exist, then it'll resolve against `D:\`.

Implementation details

- `resolveWindows` previously looped through the paths twice to build up the relevant info before doing the actual resolution. Now, `resolveWindows` iterates backwards once and keeps track of which paths are actually relevant using a bit set, which also allows it to break from the loop when it's no longer possible for earlier paths to matter.
- A standalone test was added to test parts of `relativeWindows` since the CWD resolution logic depends on CWD information from the PEB and environment variables

Edge cases worth noting

- A strange piece of trivia that I found out while working on this is that it's technically possible to have a drive letter that it outside the intended A-Z range, or even outside the ASCII range entirely. Since we deal with both WTF-8 and WTF-16 paths, `path[0]`/`path[1]`/`path[2]` will not always refer to the same bits of information, so to get consistent behavior, some decision about how to deal with this edge case had to be made. I've made the choice to conform with how `RtlDetermineDosPathNameType_U` works, i.e. treat the first WTF-16 code unit as the drive letter. This means that when working with WTF-8, checking for drive-relative/drive-absolute paths is a bit more complicated. For more details, see the lengthy comment in `std.os.windows.getWin32PathType`
- `relativeWindows` will now almost always be able to return either a fully-qualified absolute path or a relative path, but there's one scenario where it may return a rooted path: when the CWD gotten from the PEB is not a drive-absolute or UNC path (if that's actually feasible/possible?). An alternative approach to this scenario might be to resolve against the `HOMEDRIVE` env var if available, and/or default to `C:\` as a last resort in order to guarantee the result of `relative` is never a rooted path.
- Partial UNC paths (e.g. `\\server` instead of `\\server\share`) are a bit awkward to handle, generally. Not entirely sure how best to handle them, so there may need to be another pass in the future to iron out any issues that arise. As of now the behavior is:
  + For `relative`, any part of a UNC disk designator is treated as the "root" and therefore isn't applicable for relative paths, e.g. calling `relative` with `\\server` and `\\server\share` will result in `\\server\share` rather than just `share` and if `relative` is called with `\\server\foo` and `\\server\bar` the result will be `\\server\bar` rather than `..\bar`
  + For `resolve`, any part of a UNC disk designator is also treated as the "root", but relative and rooted paths are still elligable for filling in missing portions of the disk designator, e.g. `resolve` with `\\server` and `foo` or `\foo` will result in `\\server\foo`

Fixes #25703
Closes #25702
2025-11-21 00:03:44 -08:00
Alex Rønne Petersen
606c7bcc89
test: disable standalone tsan test
https://github.com/ziglang/zig/issues/25471
2025-10-05 02:13:21 +02:00
Alex Rønne Petersen
3539cad176
test: remove standalone sigpipe test
This should be restored, but there's no point keeping disabled code that's just
going to bitrot.

https://github.com/ziglang/zig/issues/25466
2025-10-04 20:53:19 +02:00
Alex Rønne Petersen
d9cd4d0876
test: remove standalone options test
This functionality is already load-bearing for the compiler's own build script
so this (disabled, possibly bitrotted?) doesn't really add value.
2025-10-04 20:51:07 +02:00
Alex Rønne Petersen
f6c1864abf
test: remove standalone issue_13970 test
It's been disabled for ages and has bitrotted. Someone can readd it later if
they feel like it actually adds value.
2025-10-04 20:51:07 +02:00
Alex Rønne Petersen
c68f9bc207
test: remove some tests that are now covered well enough by test-stack-traces
The amount of cross compilation required for these tests was too time-consuming
for how much value they added. test-stack-traces now cover these well enough,
especially as we add more exotic machines to the CI fleet to run native tests.
2025-10-04 20:51:07 +02:00
Alex Rønne Petersen
d97954a8ea
test: remove stack_iterator standalone test
Our new stack trace tests cover all the important parts of this.
2025-10-01 01:06:13 +02:00
Alex Rønne Petersen
212715f62d
test: remove pie test case from test-standalone
We already have test/cases/pie_linux.zig covering this.
2025-09-26 01:33:38 +02:00
Andrew Kelley
220c679523
Merge pull request #25197 from rootbeer/24380-flaky-sigset-test
Re-enable std.posix "sigset_t bits" test
2025-09-17 21:19:01 -07:00
Alex Rønne Petersen
a5bb7108a9
test: move glibc_compat from link to standalone tests
This is not really testing the linker.
2025-09-16 23:39:29 +02:00
Alex Rønne Petersen
d5c73a44b7
test: rename issue_8550 standalone test to compile_asm 2025-09-16 14:51:29 +02:00
Pat Tullmann
c614d8d008 standalone posix tests: add skeleton
Add build.zig, README and empty test files.
2025-09-09 22:07:44 -07:00
David Rubin
826b33863f test: add a standalone test for compiling tsan 2025-08-11 03:02:54 +02:00
Carl Åstholm
00bc72b5ff Add standalone test case for passing options to dependencies 2025-07-20 18:28:36 +02:00
Carl Åstholm
b92b55ab8e Update test build.zig.zon files to conform to the new manifest rules 2025-07-20 18:28:36 +02:00
dweiller
94e0c85b6b update dump-cov for alignment and writergate changes 2025-07-15 23:57:56 +02:00
Andrew Kelley
34f64432b0 remove usingnamespace from the language
closes #20663
2025-07-07 13:39:48 -07:00
mlugg
a02a2219ee
standalone: add accidentally-excluded test 2025-06-20 00:24:57 +01:00
mlugg
8aab222ffb Compilation: add missing link file options to cache manifest
Also add a standalone test which covers the `-fentry` case. It does this
by performing two reproducible compilations which are identical other
than having different entry points, and checking whether the emitted
binaries are identical (they should *not* be).

Resolves: #23869
2025-06-17 15:33:50 +01:00
tjog
3ed159964a
libfuzzer: add standalone test for libfuzzer initialization 2025-05-03 17:15:59 +02:00
mlugg
927f233ff8 compiler: allow emitting tests to an object file
This is fairly straightforward; the actual compiler changes are limited
to the CLI, since `Compilation` already supports this combination.

A new `std.Build` API is introduced to allow representing this. By
passing the `emit_object` option to `std.Build.addTest`, you get a
`Step.Compile` which emits an object file; you can then use that as you
would any other object, such as either installing it for external use,
or linking it into another step.

A standalone test is added to cover the build system API. It builds a
test into an object, and links it into a final executable, which it then
runs.

Using this build system mechanism prevents the build system from
noticing that you're running a `zig test`, so the build runner and test
runner do not communicate over stdio. However, that's okay, because the
real-world use cases for this feature don't want to do that anyway!

Resolves: #23374
2025-04-22 22:50:36 +01:00
Andrew Kelley
08a6c4ca9b
Merge pull request #23272 from squeek502/getenvw-optim
Windows: Faster `getenvW` and a standalone environment variable test
2025-04-11 15:46:34 -04:00
GalaxyShard
b5a5260546 std.Build: implement addEmbedPath for adding C #embed search directories 2025-03-27 09:47:42 +01:00
Ryan Liptak
752e7c0fd0 Add standalone test for environment variables
Tests all environment variable APIs in std.process
2025-03-22 15:44:27 -07:00
Techatrix
c390f55e72
add a standalone test for autoconf style addConfigHeader 2025-02-19 09:34:25 +01:00
Alex Rønne Petersen
45bb4f955c
test: Add a standalone test for omitting CFI directives. 2025-01-19 02:15:30 +01:00
Jacob Young
0bf44c3093 x86_64: fix @errorName data
The final offset was clobbering the first error name, which is revealed
by an out of bounds when the global error set is empty.

Closes #22362
2025-01-05 17:15:56 -05:00
Robin Voetter
80999391d9
re-enable emit_asm_and_bin and emit_llvm_no_bin tests
These were fixed during the last few commits too. The emit_llvm_no_bin test
is renamed from the issue_12588 test.

Closes #17484
2024-08-19 19:09:13 +02:00
Robin Voetter
294ca6563e
add standalone test for only dependending on the emitted assembly and not the bin 2024-08-19 19:09:13 +02:00
Techatrix
c746d7a35d test: check output file caching of run steps with side-effects 2024-07-21 02:03:32 -07:00
Jonathan Marler
d9f1a952b8 build: fix WriteFile and addCSourceFiles not adding LazyPath deps
Adds a missing call to addLazyPathDependenciesOnly in
std.Build.Module.addCSourceFiles.  Also fixes an issue in
std.Build.Step.WriteFile where it wasn't updating all the GeneratedFile
instances for every directory.  To fix the second issue, I removed
all the GeneratedFile instances and now all files/directories reference
the steps main GeneratedFile via sub paths.
2024-07-04 13:34:17 -04:00
Sashko
a717ac0340 Add a standalone test to cover the duplicate module bug 2024-07-01 16:36:39 -07:00
Michael Dusan
2cd536d7e8 libcxx: fix building when -fsingle-threaded
* Skip building libcxx mt-only source files when single-threaded.
* This change is required for llvm18 libcxx.
* Add standalone test to link a trivial:
    - mt-executable with libcxx
    - st-executable with libcxx
2024-06-08 15:34:19 -04:00
Jacob Young
dee9f82f69 Run: add output directory arguments
This allows running commands that take an output directory argument. The
main thing that was needed for this feature was generated file subpaths,
to allow access to the files in a generated directory. Additionally, a
minor change was required to so that the correct directory is created
for output directory args.
2024-05-05 15:58:08 -04:00
Andrew Kelley
9d64332a59
Merge pull request #19698 from squeek502/windows-batbadbut
std.process.Child: Mitigate arbitrary command execution vulnerability on Windows (BatBadBut)
2024-04-24 13:49:43 -07:00
Ryan Liptak
422464d540 std.process.Child: Mitigate arbitrary command execution vulnerability on Windows (BatBadBut)
> Note: This first part is mostly a rephrasing of https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/
> See that article for more details

On Windows, it is possible to execute `.bat`/`.cmd` scripts via CreateProcessW. When this happens, `CreateProcessW` will (under-the-hood) spawn a `cmd.exe` process with the path to the script and the args like so:

    cmd.exe /c script.bat arg1 arg2

This is a problem because:

- `cmd.exe` has its own, separate, parsing/escaping rules for arguments
- Environment variables in arguments will be expanded before the `cmd.exe` parsing rules are applied

Together, this means that (1) maliciously constructed arguments can lead to arbitrary command execution via the APIs in `std.process.Child` and (2) escaping according to the rules of `cmd.exe` is not enough on its own.

A basic example argv field that reproduces the vulnerability (this will erroneously spawn `calc.exe`):

    .argv = &.{ "test.bat", "\"&calc.exe" },

And one that takes advantage of environment variable expansion to still spawn calc.exe even if the args are properly escaped for `cmd.exe`:

    .argv = &.{ "test.bat", "%CMDCMDLINE:~-1%&calc.exe" },

(note: if these spawned e.g. `test.exe` instead of `test.bat`, they wouldn't be vulnerable; it's only `.bat`/`.cmd` scripts that are vulnerable since they go through `cmd.exe`)

Zig allows passing `.bat`/`.cmd` scripts as `argv[0]` via `std.process.Child`, so the Zig API is affected by this vulnerability. Note also that Zig will search `PATH` for `.bat`/`.cmd` scripts, so spawning something like `foo` may end up executing `foo.bat` somewhere in the PATH (the PATH searching of Zig matches the behavior of cmd.exe).

> Side note to keep in mind: On Windows, the extension is significant in terms of how Windows will try to execute the command. If the extension is not `.bat`/`.cmd`, we know that it will not attempt to be executed as a `.bat`/`.cmd` script (and vice versa). This means that we can just look at the extension to know if we are trying to execute a `.bat`/`.cmd` script.

---

This general class of problem has been documented before in 2011 here:

https://learn.microsoft.com/en-us/archive/blogs/twistylittlepassagesallalike/everyone-quotes-command-line-arguments-the-wrong-way

and the course of action it suggests for escaping when executing .bat/.cmd files is:

- Escape first using the non-cmd.exe rules
- Then escape all cmd.exe 'metacharacters' (`(`, `)`, `%`, `!`, `^`, `"`, `<`, `>`, `&`, and `|`) with `^`

However, escaping with ^ on its own is insufficient because it does not stop cmd.exe from expanding environment variables. For example:

```
args.bat %PATH%
```

escaped with ^ (and wrapped in quotes that are also escaped), it *will* stop cmd.exe from expanding `%PATH%`:

```
> args.bat ^"^%PATH^%^"
"%PATH%"
```

but it will still try to expand `%PATH^%`:

```
set PATH^^=123
> args.bat ^"^%PATH^%^"
"123"
```

The goal is to stop *all* environment variable expansion, so this won't work.

Another problem with the ^ approach is that it does not seem to allow all possible command lines to round trip through cmd.exe (as far as I can tell at least).

One known example:

```
args.bat ^"\^"key^=value\^"^"
```

where args.bat is:

```
@echo %1 %2 %3 %4 %5 %6 %7 %8 %9
```

will print

```
"\"key value\""
```

(it will turn the `=` into a space for an unknown reason; other minor variations do roundtrip, e.g. `\^"key^=value\^"`, `^"key^=value^"`, so it's unclear what's going on)

It may actually be possible to escape with ^ such that every possible command line round trips correctly, but it's probably not worth the effort to figure it out, since the suggested mitigation for BatBadBut has better roundtripping and leads to less garbled command lines overall.

---

Ultimately, the mitigation used here is the same as the one suggested in:

https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/

The mitigation steps are reproduced here, noted with one deviation that Zig makes (following Rust's lead):

1. Replace percent sign (%) with %%cd:~,%.
2. Replace the backslash (\) in front of the double quote (") with two backslashes (\\).
3. Replace the double quote (") with two double quotes ("").
4. ~~Remove newline characters (\n).~~
  - Instead, `\n`, `\r`, and NUL are disallowed and will trigger `error.InvalidBatchScriptArg` if they are found in `argv`. These three characters do not roundtrip through a `.bat` file and therefore are of dubious/no use. It's unclear to me if `\n` in particular is relevant to the BatBadBut vulnerability (I wasn't able to find a reproduction with \n and the post doesn't mention anything about it except in the suggested mitigation steps); it just seems to act as a 'end of arguments' marker and therefore anything after the `\n` is lost (and same with NUL). `\r` seems to be stripped from the command line arguments when passed through a `.bat`/`.cmd`, so that is also disallowed to ensure that `argv` can always fully roundtrip through `.bat`/`.cmd`.
5. Enclose the argument with double quotes (").

The escaped command line is then run as something like:

    cmd.exe /d /e:ON /v:OFF /c "foo.bat arg1 arg2"

Note: Previously, we would pass `foo.bat arg1 arg2` as the command line and the path to `foo.bat` as the app name and let CreateProcessW handle the `cmd.exe` spawning for us, but because we need to pass `/e:ON` and `/v:OFF` to cmd.exe to ensure the mitigation is effective, that is no longer tenable. Instead, we now get the full path to `cmd.exe` and use that as the app name when executing `.bat`/`.cmd` files.

---

A standalone test has also been added that tests two things:

1. Known reproductions of the vulnerability are tested to ensure that they do not reproduce the vulnerability
2. Randomly generated command line arguments roundtrip when passed to a `.bat` file and then are passed from the `.bat` file to a `.exe`. This fuzz test is as thorough as possible--it tests that things like arbitrary Unicode codepoints and unpaired surrogates roundtrip successfully.

Note: In order for the `CreateProcessW` -> `.bat` -> `.exe` roundtripping to succeed, the .exe must split the arguments using the post-2008 C runtime argv splitting implementation, see https://github.com/ziglang/zig/pull/19655 for details on when that change was made in Zig.
2024-04-23 03:21:51 -07:00
Meghan Denny
0820aa4a46 std: fix and add test for Build.dependencyFromBuildZig 2024-04-18 03:07:44 -07:00
Ryan Liptak
f4c4c04f1c ArgIteratorWindows: Match post-2008 C runtime rather than CommandLineToArgvW
On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function.

https://github.com/ziglang/zig/pull/18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed.

This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way).

The motivation here is roughly the same as when the same change was made in Rust (https://github.com/rust-lang/rust/pull/87580), that is (paraphrased):

- Consistent behavior between Zig and modern C/C++ programs
- Allows users to escape double quotes in a way that can be more straightforward

Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file.
2024-04-15 02:09:48 -07:00
Carl Åstholm
33b8fdb6bc Turn "simple" standalone test cases into a package 2024-04-07 16:05:54 -07:00
Carl Åstholm
d6ecfa7025 Move "simple" standalone test cases to a new directory 2024-04-07 16:05:54 -07:00
Carl Åstholm
3ed221f149 Promote standalone test cases to packages
This is a prerequisite for removing `b.anonymousDependency()`, but
having compiler tests dogfood package management might be a good idea
in general.
2024-04-07 16:05:53 -07:00