At least for x86, you can get this same information fairly easily, directly from the source. The table is located at arch/x86/entry/syscalls/syscall_64.tbl; from there you can find the implementing function with git grep. For example, git grep 'SYSCALL_DEFINE.*read'.
If you ran something like grep -r 'SYSCALL_DEFINE.*read' from the top level of the Linux source, it would search not just the source code but also all of the artifacts of building the kernel. git grep is faster in this case because it restricts the search to files that are checked in. You could achieve a similar effect with standard tools like this: find . -type f -regex '.*\.[hc]' | xargs grep 'SYSCALL_DEFINE.*read'
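If you'd rather script the lookup than read the file by eye, here is a minimal sketch of parsing that table. It assumes the four-column layout (number, ABI, name, entry point) that the header of syscall_64.tbl describes. SAMPLE is a hand-copied excerpt for illustration, and parse_syscall_table is a hypothetical helper name, not something from the kernel tree.

```python
from typing import List, NamedTuple, Optional


class Syscall(NamedTuple):
    number: int
    abi: str                     # e.g. "common", "64" or "x32"
    name: str
    entry_point: Optional[str]   # None when the line has no entry point column


# Hand-copied excerpt in the style of syscall_64.tbl (illustrative only).
SAMPLE = """\
# <number> <abi> <name> <entry point>
0\tcommon\tread\t\t\tsys_read
1\tcommon\twrite\t\t\tsys_write

39\tcommon\tgetpid\t\t\tsys_getpid
186\tcommon\tgettid\t\t\tsys_gettid
"""


def parse_syscall_table(text: str) -> List[Syscall]:
    """Return one Syscall per table line, skipping comments and blank lines.

    A list is returned (not a dict keyed by name) because the real table can
    list the same name under more than one ABI.
    """
    calls = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        fields = line.split()                  # columns are whitespace-separated
        entry = fields[3] if len(fields) > 3 else None
        calls.append(Syscall(int(fields[0]), fields[1], fields[2], entry))
    return calls


if __name__ == "__main__":
    for call in parse_syscall_table(SAMPLE):
        print(call.number, call.name, call.entry_point)
```

To run it against a real checkout, pass it the contents of arch/x86/entry/syscalls/syscall_64.tbl instead of SAMPLE.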
strace is not proof. It has its own built-in table, so strace could be wrong too.
In practice strace is widely used, so bugs tend to be discovered, reported, and fixed quickly. Without doing any analysis of my own, I'd bet that where the two disagree, this table is wrong and strace is right.
Note that the syscall list and numbers are architecture-specific. The differences are typically not huge, but they exist.
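To make that concrete, here is a small sketch comparing a handful of syscalls across three architectures. The numbers were copied by hand from the respective kernel tables as I remember them, so treat them as assumptions and check them against your own tree. syscall_number is a hypothetical helper name.

```python
# Syscall numbers per architecture; None means the architecture has no such syscall.
# Hand-copied values, to be verified against the kernel source you actually use.
SYSCALL_NUMBERS = {
    "read":   {"i386": 3,   "x86_64": 0,   "aarch64": 63},
    "write":  {"i386": 4,   "x86_64": 1,   "aarch64": 64},
    "open":   {"i386": 5,   "x86_64": 2,   "aarch64": None},
    "openat": {"i386": 295, "x86_64": 257, "aarch64": 56},
    "getpid": {"i386": 20,  "x86_64": 39,  "aarch64": 172},
    "gettid": {"i386": 224, "x86_64": 186, "aarch64": 178},
}


def syscall_number(name: str, arch: str) -> int:
    """Look up a syscall number, raising LookupError if it is unknown or absent."""
    per_arch = SYSCALL_NUMBERS.get(name)
    if per_arch is None:
        raise LookupError(f"unknown syscall: {name}")
    number = per_arch.get(arch)
    if number is None:
        raise LookupError(f"{name} is not available on {arch}")
    return number


if __name__ == "__main__":
    for name, per_arch in SYSCALL_NUMBERS.items():
        print(f"{name:8}", per_arch)
```

Note that the list itself differs, not just the numbers: in this sketch open has no entry on aarch64, where openat is used instead.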
If the man pages were up to date, this would be the index of chapter 2. I discovered Unix on Sun machines in the 90s, and I am very nostalgic for the quality of their man pages. At that time, man pages were complete and up to date. My latest frustration was with the -m option of the df command. Chapter 2 should be updated each time a new kernel version is installed.
It's very strange that adding/updating documentation isn't treated as a basic requirement for a patch that adds to or modifies Linux's public interfaces.
19) All new userspace interfaces are documented in ``Documentation/ABI/``.
See ``Documentation/ABI/README`` for more information.
Patches that change userspace interfaces should be CCed to
linux-api@vger.kernel.org.
That's all cool and everything, but the registers are wrong... Not only are they 32-bit (eax vs. rax), but their order is wrong too - the first argument in x86-64 ABI is rdi, for example.
The registers look correct for the i386 ABI: eax holds the system call number, then ebx, ecx, edx, esi, edi, ebp hold up to 6 arguments.
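The two conventions being compared here can be written down side by side. This is only an illustrative sketch: the register orders are the ones commonly documented for int 0x80 on i386 and for the syscall instruction on x86-64, and assign_registers is a hypothetical helper that just shows which register each value would land in.

```python
# Register used for the syscall number, then the argument registers in order.
CONVENTIONS = {
    "i386":   {"number": "eax", "args": ["ebx", "ecx", "edx", "esi", "edi", "ebp"]},
    "x86_64": {"number": "rax", "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"]},
}


def assign_registers(arch: str, number: int, *args) -> dict:
    """Return a register -> value mapping for one system call."""
    conv = CONVENTIONS[arch]
    if len(args) > len(conv["args"]):
        raise ValueError("a system call takes at most 6 arguments")
    regs = {conv["number"]: number}
    # zip() stops at the shorter sequence, so unused argument registers are omitted.
    regs.update(zip(conv["args"], args))
    return regs


if __name__ == "__main__":
    # write(1, buf, len): number 1 on x86_64, number 4 on i386
    print(assign_registers("x86_64", 1, 1, "buf", "len"))
    print(assign_registers("i386", 4, 1, "buf", "len"))
```

So a table that lists eax/ebx/ecx next to x86-64 syscall numbers is mixing the two columns of this mapping.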
I skimmed a couple files in the code. And it seems like it might be parsing this information out of some other sources, and maybe getting confused about the info it's grabbing?
I put out a syscall table back in the day for Linux 2.2 (up to %eax 190). Someone copied it (I'm glad.): https://www.cs.utexas.edu/~bismith/test/syscalls/syscalls32.... They didn't attribute it to me, but I remember a professor did for his class. There were better tables after that I admit, though I liked my version because it linked into the source code.
Nice work. The table was generated for 4.10, so the links to the source files should also include that kernel version in the URL path for direct access.
You might need that when you want to reimplement Linux. The Joyent team did that on their OS (derived from Solaris) so that users can run Linux binaries on a Solaris kernel (so they have DTrace, ZFS, mdb, ...). Bryan Cantrill gave a bunch of conference talks on that (one here: https://youtu.be/TrfD3pC0VSs)
The idea behind it is that Linux is only a list of syscalls: if you are able to reimplement them, you reimplement Linux, and you don't need anything else. By contrast, if you want to reimplement a BSD, you need to reimplement its libc (and perhaps some other libraries).
I'm not ultra familiar with the topic, so if someone wants to correct me please do, but:
- Linux has always been described as just a kernel, which translates as just a syscall table. Whether this table is stable or not is not relevant here.
- The *BSDs, on the other hand, ship a kernel plus a lot of libraries/binaries. If you want to simulate a BSD system, you have to expose those libraries/binaries.
It's not so much a technical difference, it's more of a different approach to OS development (kernel space vs kernel/user space).
Thing is, if syscalls in BSD are considered stable the way they are in Linux, then you could just ship your own kernel with BSD's libc. But if they consider it an internal API between kernel and libc, and apps are only ever supposed to depend on libc, then of course that doesn't work.
So stability of syscall API is the de facto differentiating factor here. It sounds like Microsoft couldn't do "Windows Subsystem for BSD" the way it did WSL, for example.
I had cause to consult a syscall table (not this one, a correct one, forget where I found it) when doing something or other with fasm. fasm's macros are pretty dang advanced...I remember having argument length checking and rudimentary type checking as well. Then I got done yak shaving and remembered that programming in asm sucks.
But yeah, it's for writing your own syscall wrappers. Something not exported by libc, or more likely if you're not using libc.
Sometimes, yes, you'll need to write your own syscall wrappers. For example, for a long time there was no gettid (get thread ID) function in glibc (a wrapper only arrived in glibc 2.30), but you can work around this by calling the syscall directly.
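A rough sketch of that workaround, written in Python with ctypes to keep it short: it goes through libc's generic syscall() function with the raw syscall number. The per-architecture numbers in GETTID_NR are assumptions copied from the kernel tables, raw_gettid is a hypothetical name, and the code assumes a native interpreter (not a 32-bit build running on a 64-bit kernel).

```python
import ctypes
import platform
import sys
import threading

# gettid syscall number per machine type (assumed values, verify for your kernel).
GETTID_NR = {"x86_64": 186, "i386": 224, "i686": 224, "aarch64": 178}


def raw_gettid():
    """Return the calling thread's ID via the raw syscall.

    Returns None when not on Linux or when the architecture is not in GETTID_NR.
    """
    nr = GETTID_NR.get(platform.machine())
    if not sys.platform.startswith("linux") or nr is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)   # symbols of the already-loaded libc
    libc.syscall.restype = ctypes.c_long
    tid = libc.syscall(ctypes.c_long(nr))      # syscall(long number, ...)
    if tid < 0:
        raise OSError(ctypes.get_errno(), "gettid syscall failed")
    return tid


if __name__ == "__main__":
    # threading.get_native_id() (Python 3.8+) should report the same thread ID on Linux.
    print(raw_gettid(), threading.get_native_id())
```

The same pattern in C is a one-liner around syscall(SYS_gettid), which is what people typically wrote before the libc wrapper existed.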
The other case where this is useful is if you want to write userspace assembly without calling a C library. This may be especially useful when you're writing a compiler, or if you're trying to write small shellcode for some reason.
Or to try making more sense of some libc implementation. The syscall stuff in glibc and musl both have a good bit of preprocessor voodoo to make syscalls feel more like function calls.
The mere fact that we are debating the correctness of this table confirms that the quality of the documentation of the OS we base our entire civilization upon is pretty poor.
This is great! It would be even more useful to have this for Mac OS X too. A lot of the projects I do end up being on both Mac and Linux. It's always a pain to find the corresponding number for the system call on Mac.
Looking at this bug, it seems that Go has "fixed" it by fixing the syscall arguments, not by switching to libc.
Did they switch to libc since then?
If not, and given that Go apps are normally statically linked, does this mean that any precompiled Go app basically has a time bomb, in the sense that it'll break the next time Apple changes some syscall?
The problem Go has is that there’s a rather large overhead for calling C functions [1]. So they did not switch to calling libc as far as I know. And yes, the next time Apple changes the syscalls, it will break again.
Wow. So, basically, Go is a rather insular ecosystem - since you're paying the overhead of a context switch for every single FFI call - and if you use the stock APIs, it's essentially broken by design on macOS (since it uses APIs that Apple itself does not consider stable).
That's really sad. I was just beginning to like some aspects of it.
Linux doesn't guarantee syscall stability either. Just make sure your wrappers can use a syscall table chosen at runtime, depending on which kernel you are running.
Yes it does, at least in the sense that syscalls which become officially public will never be removed from Linus' tree except in rare circumstances (i.e. proof that nobody is using it), nor will the arguments change. This is Linus' famous "never break user space" ABI mantra. While distributions may deprecate and remove them (e.g. sysctl(2)) they certainly won't be assigned new IDs. A table won't help in such cases.
Exactly. This is why, for example, the original 'mmap' system call entry point on x86 still exists, even though it is overwhelmingly likely that every program on your machine is actually going to use the 'mmap2' entry point.
* Missing syscalls
* Wrong syscall numbers
* Wrong calling convention
* Links to source are to wrong version
Does the table actually get anything right? I mean, this is a pretty spectacular cascade of failures.