Hacker News new | ask | show | jobs
by NobodyNada 58 days ago
Nearly every available filesystem API in Rust's stdlib maps one-to-one with a Unix syscall (see Rust's std::fs module [0] for reference -- for example, the `File` struct is just a wrapper around a file descriptor, and its associated methods are essentially just the syscalls you can perform on file descriptors). The only exceptions are a few helper functions like `read_to_string` or `create_dir_all` that perform slightly higher-level operations.

And, yeah, the Unix syscalls are very prone to mistakes like this. For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle; and so Rust has a `rename` function that takes two paths rather than an associated function on a `File`. Rust exposes path-based APIs where Unix exposes path-based APIs, and file-handle-based APIs where Unix exposes file-handle-based APIs.

So I agree that Rust's stdilb is somewhat mistake prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls beyond ensuring that you didn't write a buffer overflow.

[0]: https://doc.rust-lang.org/std/fs/index.html

2 comments

> So I agree that Rust's stdilb is somewhat mistake prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls beyond ensuring that you didn't write a buffer overflow.

`openat()` and the other `*at()` syscalls are also raw syscalls, which Rust's stdlib chose not to expose. While I can understand that this may not be straight forward for a cross-platform API, I have to disagree with your statement that Rust's stdlib is mistake prone because it's so low-level. It's more mistake prone than POSIX (in some aspects) because it is missing a whole family of low-level syscalls.

openat() is there, but it's unstable (because the dirfd-related syscalls are not all fully implemented and tested across all platforms Rust supports yet): https://doc.rust-lang.org/std/fs/struct.Dir.html#method.open...
There are lots of unstable things in Rust that have been unstable for many years, and the intentional segregating of unstable means that it's a nonstarter for most use cases, like libraries. It's unstable because there's significant enough issues that nobody wants to mark it as stable, no matter what those issues are.

As long as it's unstable it's totally fair to say Rust's stdlib does not expose them. You might as well say it's fixed because someone posted a patch on a mailing list somewhere.

There are lots of unstable things in Rust that have been unstable for many years, but this isn't one of them. openat() was added in September, and the next PR in the series implementing unlinkat() and removeat() received a code review three weeks ago and is currently waiting on the author for minor revisions.

> As long as it's unstable it's totally fair to say Rust's stdlib does not expose them. You might as well say it's fixed because someone posted a patch on a mailing list somewhere

Agreed. My comment was intended to be read as "it's planned and being worked on", not "it's available".

They're not missing, Rust just ships them (including openat) as part of the first-party libc crate rather than exposing them directly from libstd. You'll find all the other libc syscalls there as well: https://docs.rs/libc/0.2.186/libc/ . I agree that Rust's stdlib could use some higher-level helper functions to help head off TOCTOU, but it's not as simple as just exposing `openat`, which, in addition to being platform-specific as you say, is also error-prone in its own right.
But those are all unsafe, taking raw strings.

Why can I easily use "*at" functions from Python's stdlib, but not Rust's?

They are much safer against path traversal and symlink attacks.

Working safely with files should not require *const c_char.

This should be fixed .

> But those are all unsafe, taking raw strings.

The parent was asking for access to the C syscall, and C syscalls are unsafe, including in C. You can wrap that syscall in a safe interface if you like, and many have. And to reiterate, I'm all for supporting this pattern in Rust's stdlib itself. But openat itself is a questionable API (I have not yet seen anyone mention that openat2 exists), and if Rust wanted to provide this, it would want to design something distinct.

> Why can I easily use "*at" functions from Python's stdlib, but not Rust's?

I'm not sure you can. The supported pattern appears to involve passing the optional `opener` parameter to `os.open`, but while the example of this shown in the official documentation works on Linux, I just tried it on Windows and it throws a PermissionError exception because AFAIK you can't open directories on Windows.

I took parent's message to be asking why the standard library fs primitives don't use `at` functions under the hood, not that they wanted the `at` functions directly exposed.

> which Rust's stdlib chose not to expose

i.e. expose through things like `File::open()`.

> why the standard library fs primitives don't use `at` functions under the hood

In this case it wouldn't seem to make sense to use `at` functions to back the standard file opening interface that Rust presents, because it requires different parameters, so a different API would need to be designed. Someone above mentioned that such an API is being considered for inclusion in libstd in this issue: https://github.com/rust-lang/rust/issues/120426

> AFAIK you can't open directories on Windows.

You can but you have to go through the lower level API: NtCreateFile can open a directory, and you can pass in a RootDirectory handle to following calls to make them handle-relative.

You can open directories using high level win32 APIs. What you need NtCreateFile for is opening files relative to an open directory.
The nix crate provides the safe wrappers. https://docs.rs/nix/latest/nix/fcntl/fn.openat2.html
The correct comparison is to rustix, not libc, and rustix is not first-party. And even then the rustix API does not encapsulate the operations into structs the same way std::fs and std::io do.
The correct comparison to someone asking for first-party access to a C syscall is to the first-party crate that provides direct bindings to C syscalls. If you're willing to go further afield to third-party crates, you might as well skip rustix's "POSIX-ish" APIs (to quote their documentation) and go directly to the openat crate, which provides a Rust-style API.
If I have to use unsafe just to open a file, I might as well use C. While Rustix is a happy middle that is usually enough and more popular than the open at crate, libc is in the same family as the "*-sys" crate and, generally speaking, it is not intended for direct use outside other FFI crates.
I agree it’d be nice if there were a safe stdlib openat API, but

> If I have to use unsafe just to open a file, I might as well use C.

is a ridiculous exaggeration.

> For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle

And then there’s renameat(2) which takes two dirfd… and two paths from there, which mostly has all the same issues rename(2) does (and does not even take flags so even O_NOFOLLOW is not available).

I’m not sure what you’d need to make a safe renameat(), maybe a triplet of (dirfd, filefd, name[1]) from the source, (dirfd, name) from the target, and some sort of flag to indicate whether it is allowed to create, overwrite, or both.

As the recent https://blog.sebastianwick.net/posts/how-hard-is-it-to-open-... talks about (just for file but it applies to everything) secure file system interaction is absolutely heinous.

[1]: not path

How about fd of the file you wanna rename, dirfd of the directory you want to open it in, and name of the new file? You could then represent a "rename within the same directory" as: dfd = opendir(...); fd = openat(dfd, "a"); rename2(fd, dfd, "b");

I can't think of a case this API doesn't cover, but maybe there is one.

The file may have been renamed or deleted since the fd was opened, and it might have been legitimate and on purpose, but there’s no way to tell what trying to resolve the fd back to a path will give you.

And you need to do that because nothing precludes having multiple entries to the same inode in the same directory, so you need to know specifically what the source direntry is, and a direntry is just a name in the directory file.