Each u16 within a file name component can be encoded as up to 3 UTF-8 bytes, so we need to use MAX_NAME_BYTES to account for all possible UTF-8 encoded names.
Fixes#8268
Currenty copy_file_range always uses at least two syscalls:
1. As many as it needs to do the initial copy (always 1 during my
testing)
2. The last one is always when offset is the size of the file.
The second syscall is used to detect the terminating condition. However,
because we do a stat for other reasons, we know the size of the file,
and we can skip the syscall.
Sparse files: since copy_file_range expands holes of sparse files, I
conclude that this layer was not intended to work with sparse files. In
other words, this commit does not make it worse for sparse file society.
Test program
------------
const std = @import("std");
pub fn main() !void {
const arg1 = std.mem.span(std.os.argv[1]);
const arg2 = std.mem.span(std.os.argv[2]);
try std.fs.cwd().copyFile(arg1, std.fs.cwd(), arg2, .{});
}
Test output (current master)
----------------------------
Observe two `copy_file_range` syscalls: one with 209 bytes, one with
zero:
$ zig build-exe cp.zig
$ strace ./cp ./cp.zig ./cp2.zig |& grep copy_file_range
copy_file_range(3, [0], 5, [0], 4294967295, 0) = 209
copy_file_range(3, [209], 5, [209], 4294967295, 0) = 0
$
Test output (this diff)
-----------------------
Observe a single `copy_file_range` syscall with 209 bytes:
$ /code/zig/build/zig build-exe cp.zig
$ strace ./cp ./cp.zig ./cp2.zig |& grep copy_file_range
copy_file_range(3, [0], 5, [0], 4294967295, 0) = 209
$
Due to the unavailability of fchdir in Windows, a call for setting the
CWD needs to either call chdir with the path string or call
SetCurrentDirectory.
Either way, since we are dealing with a Handle in Windows, a call for
GetFinalPathNameByHandle is necessary for getting the file path first.
Windows requires the directory handle to be closed before attempting to delete the directory, so now we do that and then re-open it if we need to retry (from getting DirNotEmpty when trying to delete).
This was sized large so that `getdents` (and other platforms' equivalents) could provide large amounts of entries per syscall, but some benchmarking seems to indicate that the larger 8192 sizing doesn't actually lead to performance gains outside of edge cases like extremely large amounts of entries within a single directory (e.g. 25,000 files in one directory), and even then the gains are minimal ('./walk-8192 dir-with-tons-of-entries' ran 1.02 ± 0.34 times faster than './walk-1024 dir-with-tons-of-entries').
Note: Sizes 1024 and 2048 had similar performance characteristics, so the smaller of the two was chosen.
`deleteTree` now uses a stack-allocated stack for the first 16 nested directories, and then falls back to the previous implementation (which only keeps 1 directory open at a time) when it runs out of room in its stack. This allows the function to perform as well as a recursive implementation for most use-cases without needing allocation or introducing the possibility of stack overflow.
There are two parts to this:
1. The deleteFile call on the sub_path has been moved outside the loop, since if the first call fails with `IsDir` then it's very likely that all the subsequent calls will do the same. Instead, if the `openIterableDir` call ever hits `NotDir` after the `deleteFile` hit `IsDir`, then we assume that the tree was deleted at some point and can consider the deleteTree a success.
2. Inside the `dir_it.next()` loop, we look at entry.kind and only try doing the relevant (deleteFile/openIterableDir) operation, but always fall back to the other if we get the relevant error (NotDir/IsDir).
Before this commit, the modified test would fail with `FileNotFound` because the `entry.dir` would be for the entry itself rather than the containing dir of the entry. That is, if you were walking a tree of `a/b`, then (previously) the entry for `b` would incorrectly have an `entry.dir` for `b` rather than `a`.
`getdents` on Linux can return `ENOENT` if the directory referred to by the fd is deleted during iteration. Returning null when this happens makes sense because:
- `ENOENT` is specific to the Linux implementation of `getdents`
- On other platforms like FreeBSD, `getdents` returns `0` in this scenario, which is functionally equivalent to the `.NOENT => return null` handling on Linux
- In all the usage sites of `Iterator.next` throughout the standard library, translating `ENOENT` returned from `next` as null was the best way to handle it, so the use-case for handling the exact `ENOENT` scenario specifically may not exist to a relevant extent
Previously, ENOENT being returned would trigger `os.unexpectedErrno`.
Closes#12211
This change adds support for locating the Zig executable and the library
and global cache directories, based on looking in the fixed "/zig" and
"/cache" directories.
Since our argv[0] on WASI is just the basename (any absolute/relative
path information is deleted by the runtime), there's very limited
introspection we can do on WASI, so we rely on these fixed directories.
These can be provided on the command-line using `--mapdir`, as follows:
```
wasmtime --mapdir=/cwd::. --mapdir=/cache::"$HOME/.cache/zig" --mapdir=/zig::./zig-out/ ./zig-out/bin/zig.wasm
```
The call to `makeDir` for the top-level component of `sub_path`
can return `error.FileNotFound` if the directory represented by
`self` has been deleted.
Fixes#11397
This adds a special CWD file descriptor, AT.FDCWD (-2), to refer to the
current working directory. The `*at(...)` functions look for this and
resolve relative paths against the stored CWD. Absolute paths are
dynamically matched against the stored Preopens.
"os.initPreopensWasi()" must be called before std.os functions will
resolve relative or absolute paths correctly. This is asserted at
runtime.
Support has been added for: `open`, `rename`, `mkdir`, `rmdir`, `chdir`,
`fchdir`, `link`, `symlink`, `unlink`, `readlink`, `fstatat`, `access`,
and `faccessat`.
This also includes limited support for `getcwd()` and `realpath()`.
These return an error if the CWD does not correspond to a Preopen with
an absolute path. They also do not currently expand symlinks.
Implements a cross-platform metadata API, aiming to reduce unnecessary Unix-dependence of the `std.fs` api. Presently, all OSes beside Windows are treated as Unix; this is likely the best way to treat things by default, instead of explicitly listing each Unix-like OS.
Platform-specific operations are not provided by `File.Metadata`, and instead are to be accessed from `File.Metadata.inner`.
Adds:
- File.setPermissions() : Sets permission of a file according to a `Permissions` struct (not available on WASI)
- File.Permissions : A cross-platform representation of file permissions
- Permissions.readOnly() : Returns whether the file is read-only
- Permissions.setReadOnly() : Sets whether the file is read-only
- Permissions.unixSet() : Sets permissions for a class (UNIX-only)
- Permissions.unixGet() : Checks a permission for a class (UNIX-only)
- Permissions.unixNew() : Returns a new Permissions struct to represent the passed mode (UNIX-only)
- File.Metadata : A cross-platform representation of file metadata
- Metadata.size() : Returns the size of a file
- Metadata.permissions() : Returns a `Permissions` struct, representing permissions on the file
- Metadata.kind() : Returns the `Kind` of the file
- Metadata.accessed() : Returns the time the file was last accessed
- Metadata.modified() : Returns the time the file was last modified
- Metadata.created() : Returns the time the file was created (this is an optional, as the underlying filesystem, or OS may not support this)
Methods of `File.Metadata` are also available for the below, so I won't repeat myself
The below may be used for platform-specific functionality
- File.MetadataUnix : The internal implementation of `File.Metadata` on Unices
- File.MetadataLinux : The internal implementation of `File.Metadata` on Linux
- File.MetadataWindows : The implementation of `File.Metadata` on Windows
This fixes the use of multiple `Iterator`s in a row on a directory.
Previously, on many platforms, using an `Iterator` on an
already-iterated directory would give no entries.
Fixing this involved seeking to the beginning of the directory on the
first call of `next()`.
The semantics of this function are that it moves both files and
directories. Previously we had this `is_dir` boolean field of
`std.os.windows.OpenFile` which required the API user to choose: are we
opening a file or directory? And the other kind would either cause
error.IsDir or error.NotDir. But that is not a limitation of the Windows
file system API; it was self-imposed.
On Windows, rename is implemented internally with `NtCreateFile` so we
need to allow it to open either files or directories. This is now done
by `std.os.windows.OpenFile` accepting enum{file_only,dir_only,any}
instead of a boolean.
All Zig code is eligible to `@import("builtin")` which is mapped to a
generated file, build.zig, based on the target and other settings.
Zig invocations which share the same target settings will generate the
same builtin.zig file and thus the path to builtin.zig is in a shared
cache folder, and different projects can sometimes use the same file.
Before this commit, this led to race conditions where multiple
invocations of `zig` would race to write this file. If one process
wanted to *read* the file while the other process *wrote* the file, the
reading process could observe a truncated or partially written
builtin.zig file.
This commit makes the following improvements:
- limitations:
- avoid clobbering the inode, mtime in the hot path
- avoid creating a partially written file
- builtin.zig needs to be on disk for debug info / stack trace purposes
- don't mark the task as complete until the file is finished being populated
(possibly by an external process)
- strategy:
- create the `@import("builtin")` `Module.File` during the AstGen
work, based on generating the contents in memory rather than
loading from disk.
- write builtin.zig in a separate task that doesn't have
to complete until the end of the AstGen work queue so that it
can be done in parallel with everything else.
- when writing the file, first stat the file path. If it exists, we are done.
- otherwise, write the file to a temp file in the same directory and atomically
rename it into place (clobbering the inode, mtime in the cold path).
- summary:
- all limitations respected
- hot path: one stat() syscall that happens in a worker thread
This required adding a missing function to the standard library:
`std.fs.Dir.statFile`. In this commit, it does open() and then fstat()
which is two syscalls. It should be improved in a future commit to only
make one.
Fixes#9439.
* add team_info, area_info
* update signature for get_next_image_info
* add error checks for haiku system calls
* update and cleanup of haiku constants