These functions are currently footgunny when working with pointers to
arrays and slices. They just return the stated length of the array/slice
without iterating and looking for the first sentinel, even if the
array/slice is a sentinel terminated type.
From looking at the quite small list of places in the standard
library/compiler that this change breaks existing code, the new code
looks to be more readable in all cases.
The usage of std.mem.span/len was totally unneeded in most of the cases
affected by this breaking change.
We could remove these functions entirely in favor of other existing
functions in std.mem such as std.mem.sliceTo(), but that would be a
somewhat nasty breaking change as std.mem.span() is very widely used for
converting sentinel terminated pointers to slices. It is however not at
all widely used for anything else.
Therefore I think it is better to break these few non-standard and
potentially incorrect usages of these functions now and at some later
time, if deemed worthwhile, finally remove these functions.
If we wait for at least a full release cycle so that everyone adapts to
this change first, updating for the removal could be a simple find and
replace without needing to worry about the semantics.
All but 3 callsites of this function in the standard library and
compiler were unnecessary and were removed in faf2fd18.
In this commit, the remaining 3 callsites are removed. One of them
turned out to also be unnecessary and has been replaced by slicing
directly with the length..
The 2 remaining callsites were in the very pointer-math heavy
std/os/linux/vdso.zig code which should perhaps be refactored to better
utilize slices. These 2 callsites are replaced with a plain
@ptrCast([*:0]u8, ptr) though could likely use std.mem.sliceTo() if the
surrounding code was refactored.
This fixes a bug in std.net caused during the introduction of
meta.assumeSentinel due to the unfortunate semantics of mem.span()
This leaves only 3 remaining uses of meta.assumeSentinel() in the
standard library, each of which could be a simple @ptrCast([*:0]T, foo)
instead. I think this function should likely be removed.
Can occur when trying to open a directory for iteration but the 'List folder contents' permission of the directory is set to 'Deny'.
This was found because it was being triggered during PATH searching in ChildProcess.spawnWindows if a PATH entry did not have 'List folder contents' permission, so this fixes that as well (note: the behavior on hitting this during PATH searching is to treat it as the directory not existing and therefore will fail to find any executables in a directory in the PATH without 'List folder contents' permission; this matches Windows behavior which also fails to find commands in directories that do not have 'List folder contents' permission).
* revert changes to Module because the error set is consistent across
operating systems.
* remove duplicated Stat.fromSystem code and use a less redundant name.
* make fs.Dir.statFile follow symlinks, and avoid pointless control
flow through the posix layer.
This branch largely reverts 58f961f4cb. I
would like to revisit the proposal to modify the standard library in
this way and think more carefully about it before adding isAbsolute()
checks everywhere.
Instead of checking for absolute paths and current working directories
in various file system operations, there is one simple solution: allow
overriding `std.fs.cwd` on WASI.
os.realpath is back to causing a compile error when used on WASI. This
caused a compile error in the Sema handling of `@src()`. The compiler
should never call realpath, so the commit that made this change is
reverted (95ab942184). If this breaks
debug info, a different strategy is needed to solve it other than using
realpath.
I also removed the preopens code and replaced it with something much
simpler. There is no longer any global state in the standard library.
Additionally-
* os.openat no longer does an unnecessary fstat on WASI when O.WRONLY
is not provided.
* os.chdir is back to causing a compile error on WASI.
Make the test use the minimum length and set MAX_NAME_BYTES to the maximum so that:
- the test will work on any host platform
- *and* the MAX_NAME_BYTES will be able to hold the max file name component on any host platform
Each u16 within a file name component can be encoded as up to 3 UTF-8 bytes, so we need to use MAX_NAME_BYTES to account for all possible UTF-8 encoded names.
Fixes#8268
Currenty copy_file_range always uses at least two syscalls:
1. As many as it needs to do the initial copy (always 1 during my
testing)
2. The last one is always when offset is the size of the file.
The second syscall is used to detect the terminating condition. However,
because we do a stat for other reasons, we know the size of the file,
and we can skip the syscall.
Sparse files: since copy_file_range expands holes of sparse files, I
conclude that this layer was not intended to work with sparse files. In
other words, this commit does not make it worse for sparse file society.
Test program
------------
const std = @import("std");
pub fn main() !void {
const arg1 = std.mem.span(std.os.argv[1]);
const arg2 = std.mem.span(std.os.argv[2]);
try std.fs.cwd().copyFile(arg1, std.fs.cwd(), arg2, .{});
}
Test output (current master)
----------------------------
Observe two `copy_file_range` syscalls: one with 209 bytes, one with
zero:
$ zig build-exe cp.zig
$ strace ./cp ./cp.zig ./cp2.zig |& grep copy_file_range
copy_file_range(3, [0], 5, [0], 4294967295, 0) = 209
copy_file_range(3, [209], 5, [209], 4294967295, 0) = 0
$
Test output (this diff)
-----------------------
Observe a single `copy_file_range` syscall with 209 bytes:
$ /code/zig/build/zig build-exe cp.zig
$ strace ./cp ./cp.zig ./cp2.zig |& grep copy_file_range
copy_file_range(3, [0], 5, [0], 4294967295, 0) = 209
$
Due to the unavailability of fchdir in Windows, a call for setting the
CWD needs to either call chdir with the path string or call
SetCurrentDirectory.
Either way, since we are dealing with a Handle in Windows, a call for
GetFinalPathNameByHandle is necessary for getting the file path first.
Windows requires the directory handle to be closed before attempting to delete the directory, so now we do that and then re-open it if we need to retry (from getting DirNotEmpty when trying to delete).
This was sized large so that `getdents` (and other platforms' equivalents) could provide large amounts of entries per syscall, but some benchmarking seems to indicate that the larger 8192 sizing doesn't actually lead to performance gains outside of edge cases like extremely large amounts of entries within a single directory (e.g. 25,000 files in one directory), and even then the gains are minimal ('./walk-8192 dir-with-tons-of-entries' ran 1.02 ± 0.34 times faster than './walk-1024 dir-with-tons-of-entries').
Note: Sizes 1024 and 2048 had similar performance characteristics, so the smaller of the two was chosen.
`deleteTree` now uses a stack-allocated stack for the first 16 nested directories, and then falls back to the previous implementation (which only keeps 1 directory open at a time) when it runs out of room in its stack. This allows the function to perform as well as a recursive implementation for most use-cases without needing allocation or introducing the possibility of stack overflow.
There are two parts to this:
1. The deleteFile call on the sub_path has been moved outside the loop, since if the first call fails with `IsDir` then it's very likely that all the subsequent calls will do the same. Instead, if the `openIterableDir` call ever hits `NotDir` after the `deleteFile` hit `IsDir`, then we assume that the tree was deleted at some point and can consider the deleteTree a success.
2. Inside the `dir_it.next()` loop, we look at entry.kind and only try doing the relevant (deleteFile/openIterableDir) operation, but always fall back to the other if we get the relevant error (NotDir/IsDir).
Before this commit, the modified test would fail with `FileNotFound` because the `entry.dir` would be for the entry itself rather than the containing dir of the entry. That is, if you were walking a tree of `a/b`, then (previously) the entry for `b` would incorrectly have an `entry.dir` for `b` rather than `a`.
`getdents` on Linux can return `ENOENT` if the directory referred to by the fd is deleted during iteration. Returning null when this happens makes sense because:
- `ENOENT` is specific to the Linux implementation of `getdents`
- On other platforms like FreeBSD, `getdents` returns `0` in this scenario, which is functionally equivalent to the `.NOENT => return null` handling on Linux
- In all the usage sites of `Iterator.next` throughout the standard library, translating `ENOENT` returned from `next` as null was the best way to handle it, so the use-case for handling the exact `ENOENT` scenario specifically may not exist to a relevant extent
Previously, ENOENT being returned would trigger `os.unexpectedErrno`.
Closes#12211
This change adds support for locating the Zig executable and the library
and global cache directories, based on looking in the fixed "/zig" and
"/cache" directories.
Since our argv[0] on WASI is just the basename (any absolute/relative
path information is deleted by the runtime), there's very limited
introspection we can do on WASI, so we rely on these fixed directories.
These can be provided on the command-line using `--mapdir`, as follows:
```
wasmtime --mapdir=/cwd::. --mapdir=/cache::"$HOME/.cache/zig" --mapdir=/zig::./zig-out/ ./zig-out/bin/zig.wasm
```
The call to `makeDir` for the top-level component of `sub_path`
can return `error.FileNotFound` if the directory represented by
`self` has been deleted.
Fixes#11397