by tschneidereit 2637 days ago
(Member of the team at Mozilla here.)

Yes, that's the list.

And the layout of structs, strings, etc. is up to the compiler, within the bounds of the restrictions WebAssembly imposes.

We'll definitely have a test suite, but this is all early days, so a lot of that isn't in place yet.

And yes, this can be targeted by LLVM-based and other compilers. In fact, Emscripten could use this as the foundation for their POSIX-like libc and library packages. The syscalls are indeed exposed as Wasm function imports.


Will WASI normalize differences between platforms? e.g. convert argv or paths to a consistent character encoding?
Yes!
How will you deal with valid paths such as:

  /tmp/[DE][AD][BE][EF].txt # ext2 / linux
  # OR
  C:\stuff\[DEED][FFFE].txt # ntfs / windows
  # where [hex] indicates a single filesystem character with that value
One fun thing about the capability model is that at the system call level, there are no absolute paths. All filesystem path references are relative to base directory handles. So even if an application thinks it wants something in C:\stuff, it's the job of the libraries linked into the application to map that to something that can actually be named. So there's room for the ecosystem to innovate, above the WASI syscall layer, on what "C:\" should mean in an application intending to be portable.
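To make the "no absolute paths" idea concrete, here is a minimal sketch in Python. It is not WASI's actual API: `resolve_under` is a hypothetical helper, and the base directory is modeled as a plain string rather than a real directory handle. It only illustrates the rule that every path is interpreted relative to a base the application was granted, and that anything outside it cannot be named.

```python
import posixpath


def resolve_under(base_dir: str, relative: str) -> str:
    """Hypothetical helper: resolve `relative` against `base_dir`.

    Absolute paths and paths that climb out of `base_dir` are refused,
    mimicking a capability-style lookup. This is a string-level check only;
    a real implementation would also have to deal with symlinks.
    """
    if posixpath.isabs(relative):
        raise PermissionError("absolute paths cannot be named")
    base = posixpath.normpath(base_dir)
    joined = posixpath.normpath(posixpath.join(base, relative))
    # The result must be the base itself or something strictly inside it.
    if joined != base and not joined.startswith(base.rstrip("/") + "/"):
        raise PermissionError("path escapes the base directory")
    return joined


def is_refused(base_dir: str, relative: str) -> bool:
    """Return True if the lookup would be rejected."""
    try:
        resolve_under(base_dir, relative)
        return False
    except PermissionError:
        return True
```

With a base of `/sandbox`, a lookup of `data/in.txt` resolves inside the sandbox, while `/etc/passwd` and `../etc/passwd` are both refused. What a library should do with an application-level name like `C:\stuff` is exactly the part this sketch leaves open.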

Concerning character encodings, and potentially case sensitivity, the current high-level idea is that paths at the WASI syscall layer will be UTF-8, and WASI implementations will perform translation under the covers as needed. Of course, that doesn't fix everything, but it's a starting point.

That’s good to know, but the parent’s examples seem to be referencing the issue of filenames that aren’t valid Unicode. The Linux example is invalid UTF-8, since Linux filenames are natively arbitrary byte sequences. The Windows example contains an unpaired surrogate followed by the reserved codepoint 0xfffe, since Windows filenames are natively UCS-2.
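Both claims can be checked with a short Python sketch. The helper names `is_valid_utf8` and `is_well_formed_utf16` are made up for illustration, and the two file names are rebuilt from the byte and code unit values given in the parent's examples.

```python
import struct
from typing import Sequence


def is_valid_utf8(raw: bytes) -> bool:
    """True if `raw` decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_well_formed_utf16(units: Sequence[int]) -> bool:
    """True if the 16-bit code units form well-formed UTF-16."""
    raw = struct.pack(f"<{len(units)}H", *units)
    try:
        raw.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


# Linux example: DE AD decodes as a two-byte sequence, but BE is a stray
# continuation byte, so the name as a whole is not valid UTF-8.
linux_name = bytes([0xDE, 0xAD, 0xBE, 0xEF]) + b".txt"

# Windows example: 0xDEED is a low surrogate with no preceding high surrogate.
windows_name = [0xDEED, 0xFFFE] + [ord(c) for c in ".txt"]
```

Under these definitions, `is_valid_utf8(linux_name)` and `is_well_formed_utf16(windows_name)` are both false, so neither name can be passed through a strictly UTF-8 interface without some escaping scheme.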
WTF-8 could solve the Windows issue and I think for Linux it’s time to demand Unicode filenames :)
What about platform differences like how file permissions work on Windows vs. POSIX? (i.e., stuff that Python does not fully normalize)
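As a rough illustration of the WTF-8 idea, Python's `surrogatepass` error handler can carry unpaired surrogates through a UTF-8-shaped encoding and back. This is an approximation, not a conforming WTF-8 implementation (for one, the decoder accepts some byte sequences that WTF-8 forbids), and the function names are hypothetical.

```python
import struct
from typing import List, Sequence


def utf16_units_to_wtf8ish(units: Sequence[int]) -> bytes:
    """Encode 16-bit code units, lone surrogates included, as UTF-8-like bytes."""
    raw = struct.pack(f"<{len(units)}H", *units)
    # Proper surrogate pairs become one code point; lone surrogates are kept.
    text = raw.decode("utf-16-le", "surrogatepass")
    return text.encode("utf-8", "surrogatepass")


def wtf8ish_to_utf16_units(data: bytes) -> List[int]:
    """Invert utf16_units_to_wtf8ish, recovering the original code units."""
    text = data.decode("utf-8", "surrogatepass")
    raw = text.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


# The two problem code units from the Windows example round-trip unchanged.
units = [0xDEED, 0xFFFE]
encoded = utf16_units_to_wtf8ish(units)
decoded = wtf8ish_to_utf16_units(encoded)
```

Here `decoded == units`, and the lone surrogate is stored as the three bytes `ED BB AD`, which a strict UTF-8 decoder would reject. That is the trade-off: the name survives the round trip, but the bytes in between are not valid UTF-8.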
I have a dollar that says all platform difference issues will be solved by just doing whatever POSIX does and expecting the host OS to figure it out if it isn't already POSIX. Whenever you try to abstract away arbitrarily different implementations while retaining their non-common functionality you either end up reimplementing one of them and expecting the others to work around it, or you end up forcing the programmer to bypass the abstraction anyway and implement logic for each implementation.
I wouldn't count on that.

I have worked on file APIs. There are so many differences between Windows and POSIX that abstracting them away just doesn't work. Undoubtedly, there will eventually be platform-specific APIs that implement one or the other, and cross-platform APIs that implement the intersection.

It's a good question. WASI currently doesn't allow you to set custom access-control permissions when creating files. But we're just getting started, so if we can find a design that works, we can add it.