| HN Mirror



	by thayne 457 days ago
	PATH isn't just handled by the shell though. Many (but not all!) of the exec* family of functions in libc respect PATH.
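To illustrate the kind of lookup the PATH-aware variants (execlp, execvp, execvpe) do in user space before they ever reach the kernel, here is a minimal sketch. It is written in Python rather than C for brevity, `resolve_in_path` is a hypothetical helper rather than libc's actual code, and it leaves out details such as the default path used when PATH is unset and the handling of permission errors partway through the search.

```python
import os


def _is_executable_file(candidate):
    # Regular file that the current user may execute.
    return os.path.isfile(candidate) and os.access(candidate, os.X_OK)


def resolve_in_path(name, path):
    """Return the first executable candidate for `name`, or None.

    Hypothetical helper mimicking the lookup rule of execvp-style
    functions; it is not the libc implementation.
    """
    # A name containing a slash is used as-is; PATH is not consulted.
    if "/" in name:
        return name if _is_executable_file(name) else None
    for directory in path.split(":"):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if _is_executable_file(candidate):
            return candidate
    return None
```

Something like `resolve_in_path("ls", os.environ.get("PATH", os.defpath))` would then give the pathname that a plain execve-style call could be handed directly.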


opello 457 days ago

It seems too far to say that, because a system library holds some implementation details, the responsibility doesn't lie with the program using them. There are all sorts of complex, interdependent details that make those kinds of boundary distinctions difficult in many operating systems.

matheusmoreira 457 days ago

On Linux the main boundary between user space and kernel is quite clear: the system call layer. It is stable and well documented.

https://github.com/torvalds/linux/blob/master/Documentation/...

System libraries like glibc are not part of the kernel, they are just components that can be replaced.

I wrote an article about it:

https://www.matheusmoreira.com/articles/linux-system-calls

I even asked Greg Kroah-Hartman about it:

https://old.reddit.com/r/linux/comments/fx5e4v/im_greg_kroah...

> So we rely on different libc projects to provide this, and work with them when needed.

> This ends up being more flexible as there are different needs from a libc, and for us to "pick one" wouldn't always be fair.

> And yes, you can just use a "nolibc" type implementation if you like.

> I know I do that for new syscalls when working on them, there's nothing stopping anyone else from doing that as well.

You can trash the entire GNU system and rewrite it all in Rust or Lisp if you wanted. It doesn't have to be some POSIX-like thing either, it could be whatever you wanted it to be. It doesn't need to have things like PATH. You could write a static freestanding application and boot Linux directly into it.

Nobody does stuff like this; it's a lifetime of work. But it could be done.

opello 457 days ago

That is indeed one of the more well defined boundaries in the system. Also worth understanding is that programs aren't generally invoking system calls directly, for example calling interrupt 0x80, glibc provides wrapper functions that invoke system calls, blurring the boundary a bit. Further blurring the boundary is the vDSO layer that intercepts some system call wrappers for more efficient access.

At issue in this article and comment thread is the boundary between the shell, environment, and Linux. This is a blurrier boundary still because the shell sets up the environment, which is passed through the kernel, and interpreted for downstream processes, generally (but not necessarily) by that shared system library.

wahern 457 days ago

> vDSO layer that intercepts some system call wrappers for more efficient access.

Technically the vDSO library doesn't intercept. libc chooses to use either the vDSO or the syscall. This can happen either in the wrapper itself, or through a special PLT helper where the linker asks libc to resolve the symbol to populate the GOT entry. vDSO symbols have the prefix __kernel_ or __vdso_.

opello 457 days ago

That's fair, sorry for my casual language in a technically nuanced discussion. I hadn't looked at this in quite a while, but it was good to review. Thanks for the prompting.

https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/unix...

https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/unix...

wahern 457 days ago

I also double-checked the glibc and musl code to make sure I wasn't misremembering, and ended up learning about IFUNC.[1] Previously I had avoided going down the rabbit hole to understand what glibc's libc_ifunc was doing. I don't think musl uses IFUNC, at least not for clock_gettime; it seems to always link the wrapper which calls the vdso through an internally managed pointer.[2]

And now I'm wondering how safe all this indirection is. For the PLT/GOT approach I think you can disable lazy binding and force the GOT to be read-only so exploits can't overwrite the symbol addresses. But for musl's approach it doesn't seem like you can make its internal function pointer read-only, though maybe its address is more difficult to find than the GOT slots.

[1] https://sourceware.org/glibc/wiki/GNU_IFUNC [2] https://git.musl-libc.org/cgit/musl/tree/src/time/clock_gett...

im3w1l 457 days ago

Modern binaries use the syscall instruction instead of int 0x80. The latter still works though.

matheusmoreira 457 days ago

> Also worth understanding is that programs aren't generally invoking system calls directly

They don't generally do that but they absolutely can. I wrote a Lisp interpreter that does just that. It's completely static, has zero dependencies and talks to the kernel directly. The idea is to implement every primitive on top of Linux, and everything else on top of the primitives.

From the kernel's perspective, every program is talking to it directly. They just typically use glibc routines to do it for them. There's no actual need for glibc to be there though.

At some point I even tried adding Linux system call builtins to GCC so that the compiler itself would generate the code in the correct calling convention. Lost that work due to a hard disk crash but on the mailing list I didn't get the impression the maintainers favored merging it anyway.

> for example calling interrupt 0x80, glibc provides wrapper functions that invoke system calls, blurring the boundary a bit

Not all of them. It still doesn't support all of the clone system calls.

https://www.man7.org/linux/man-pages/man2/clone.2.html

  Note: glibc provides no wrapper for clone3(),
        necessitating the use of syscall(2).

It's not just niche system calls either. It took years for glibc to provide getrandom.

https://www.man7.org/linux/man-pages/man2/getrandom.2.html

https://lwn.net/Articles/711013/

It's really annoying how these glibc wrappers get confused with the actual Linux system calls which work very differently. The most notable difference is there's no global thread local errno nonsense with the real system calls, the kernel just gives you a perfectly normal return value in a register. There's also a ton of glibc machinery related to system call cancellation that gets linked in if you use it.
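A rough sketch of the difference, assuming the usual Linux convention that a raw return value in the range -4095..-1 is a negated error number. `decode_raw_return` is a made-up name for illustration, and the real glibc macros are more involved than this; Python is used only to keep it short.

```python
import errno

MAX_ERRNO = 4095  # raw returns in [-4095, -1] are treated as errors


def decode_raw_return(raw):
    """Translate a raw syscall return value into (result, errno_value).

    Mirrors what a libc-style wrapper does: on error it hands back -1
    and reports the positive error code separately (libc stores it in
    the thread-local errno). Successful values pass through untouched.
    """
    if -MAX_ERRNO <= raw < 0:
        return -1, -raw
    return raw, 0
```

So a raw result of `-errno.ENOENT` straight from the kernel becomes `(-1, errno.ENOENT)` once it has gone through the wrapper-style translation, while a program talking to the kernel directly can just inspect the register value itself.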

Documentation out there conflates the two. I expected the man page above to describe only the Linux system call but it also describes the glibc specific stuff. That way people get the impression they are one and the same.

> Further blurring the boundary is the vDSO layer that intercepts some system call wrappers for more efficient access.

The vDSO is a documented stable Linux kernel interface:

https://github.com/torvalds/linux/blob/master/Documentation/...

It's just a perfectly normal ELF shared object that the kernel maps into the address space of every process on certain architectures. Its address is passed via the auxiliary vector which is located immediately after the environment vector. Glibc merely finds it and uses it. I can make my interpreter use it too.

It's completely optional. Its purpose is making certain system calls faster by eliminating the switch to kernel mode. This is useful for time/date system calls which are invoked frequently. The original system calls are still available though.
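For anyone who wants to poke at the auxiliary vector mentioned above, here is a small sketch of parsing it. It assumes the layout exposed by /proc/self/auxv: native-endian pairs of pointer-sized unsigned integers, terminated by an AT_NULL entry, with AT_SYSINFO_EHDR (33) carrying the vDSO's address. `parse_auxv` is a hypothetical helper, not something glibc or the kernel ships.

```python
import struct

AT_NULL = 0            # terminates the vector
AT_SYSINFO_EHDR = 33   # address of the vDSO's ELF header


def parse_auxv(data, word_size=struct.calcsize("P")):
    """Parse raw auxiliary vector bytes into a {type: value} dict.

    Assumes native byte order and entries that are pairs of
    pointer-sized unsigned integers (8 or 4 bytes each).
    """
    fmt = "=QQ" if word_size == 8 else "=II"
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, data):
        if a_type == AT_NULL:
            break  # nothing meaningful follows the terminator
        entries[a_type] = a_val
    return entries
```

On a Linux machine, `parse_auxv(open("/proc/self/auxv", "rb").read()).get(AT_SYSINFO_EHDR)` should give the address where the vDSO is mapped, if the architecture provides one.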

> This is a blurrier boundary still because the shell sets up the environment, which is passed through the kernel, and interpreted for downstream processes, generally (but not necessarily) by that shared system library.

The shell passes the environment to the execve system call but the kernel does not interpret it in any way. It doesn't even enforce the "key=value" format since this is just a convention. It's essentially an opaque array of strings and it's up to user space to make sense of whatever it contains. Glibc chooses to parse those strings into program state in the form of environment variables whose values programmers can query.
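As a sketch of what that user space interpretation amounts to, here is a getenv-style lookup over a plain list of strings. `getenv_like` is a hypothetical stand-in and not glibc's code; it only assumes the usual convention of taking the first entry that starts with "name=".

```python
def getenv_like(envp, name):
    """Look up `name` in a list of raw environment strings.

    Scans for the first entry that starts with "name=" and returns
    everything after that first "=". Entries that do not follow the
    key=value convention are never matched.
    """
    prefix = name + "="
    for entry in envp:
        if entry.startswith(prefix):
            return entry[len(prefix):]
    return None
```

With `envp = ["PATH=/usr/bin:/bin", "not a key value pair", "A=1=2"]`, the kernel would hand all three strings to the new program unchanged; it is only a lookup like this one that decides the second string is not a variable at all.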

Joker_vD 457 days ago

> It took years for glibc to provide getrandom.

A tangent: Robert Clausecker, the guy who submitted the proposal for adding tcgetwinsize() and SIGWINCH to POSIX, apparently did it because it "is probably the easiest way to get glibc to implement a feature you want" [0].

[0] https://news.ycombinator.com/item?id=42041467

opello 457 days ago

My use of "blurry" is because you asserted a clear boundary between user and kernel space. While I agree that this boundary is well-defined, it is indeed "blurred" (made less clear) by the glibc function wrappers and the vDSO's injected functions, because glibc is a system library and the vDSO is a blob of library code mapped into the process by the kernel. It's not a simple interrupt to context switch and return when complete with state having been updated from "over the fence."

To me the description of a "clear boundary" should avoid the amount of nuance around whether the application's call lands in a library or the kernel's syscall handler. The fact that it doesn't means that the boundary is less clear, or blurry as was the term I adopted here.

yencabulator 455 days ago

> You could write a static freestanding application and boot Linux directly into it.

> Nobody does stuff like this; it's a lifetime of work. But it could be done.

1) Go binaries on Linux don't need libc (except for NSS, which is glibc-only idiocy).

2) Running a static libc-less binary as init is fun and easy! https://github.com/tv42/alone https://gokrazy.org/

matheusmoreira 452 days ago

> Go binaries on Linux don't need libc (except for NSS, which is glibc-only idiocy).

They don't need NSS either, they just chose to depend on it because getting rid of it was too painful. Everyone's addicted to glibc and nobody enjoys going through the withdrawal symptoms.

> Running a static libc-less binary as init is fun and easy!

It's incredibly fun. I wrote a freestanding Linux Lisp interpreter and my long term goal is to boot Linux into it and bring up the entire system from within the REPL.

https://github.com/lone-lang/lone

I haven't tested it but I bet Linux can already boot into it just fine. I still need to do more work to make it able to bring up the system though. I've implemented endianness independent memory access functions, now I need a binary structure encoder and decoder. That will enable programs to do kernel I/O properly.

What's holding me back right now is continuations. The language still doesn't have flow control. I've been slowly converting my interpreter into an interruptible virtual machine with continuations that can integrate with primitives written in C. Debugging this stuff turned out to be a nightmare.

matheusmoreira 457 days ago

Those functions aren't the real system calls provided by Linux, they're just glibc wrappers with added functionality. Linux kernel execve has absolutely no concept of PATH, it just opens the file at the provided pathname. That's a good thing too, user space might want to customize that stuff.

thayne 457 days ago

Sure, but it is also not the same thing as the shell.

matheusmoreira 457 days ago

Yes. Shells typically do their own path resolution as well. I know GNU bash does, at least. I customized that logic in order to make a little library system for shell scripts.

kccqzy 457 days ago

There is also paths.h usually located at /usr/include/paths.h. It contains the default PATH macro _PATH_DEFPATH.