| HN Mirror

by avidiax 661 days ago

It is sometimes used to allow one binary to be the symlink target of hundreds of commands.

Android does this for most common shell commands. Toybox and busybox are examples of such implementations.

https://github.com/landley/toybox

https://en.m.wikipedia.org/wiki/BusyBox
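
A minimal sketch of the multi-call idea described above, not how Toybox or BusyBox are actually written. The applet names and the `dispatch` helper are hypothetical; the only point is that the basename of argv[0] selects the behaviour.

```python
import os
import sys

# Hypothetical applets, for illustration only.
def applet_true(args):
    return 0

def applet_false(args):
    return 1

def applet_echo(args):
    print(" ".join(args))
    return 0

APPLETS = {"true": applet_true, "false": applet_false, "echo": applet_echo}

def dispatch(argv):
    """Select an applet from the basename of argv[0], the name we were invoked as."""
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        print(f"{name}: applet not found", file=sys.stderr)
        return 127
    return applet(argv[1:])
```

Calling `dispatch(sys.argv)` from a script that is symlinked as `true`, `false`, and `echo` would then run a different applet per link name.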

4 comments

cubist_castle 661 days ago

I just learned that rustup/rustc/cargo etc. work like this too. I couldn't understand why the gentoo formula was symlinking the same binary to a bunch of aliases.

kbolino 661 days ago

On my system, these are hardlinks (regular files with a link count >1 and the same inode) rather than symlinks, though I'm not sure why.
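
A rough way to check that distinction, assuming a POSIX filesystem that supports hard links. The file names and the `link_facts` helper are made up for the example.

```python
import os
import tempfile

def link_facts(directory):
    """Create a file, a hard link, and a symlink to it, then compare their metadata."""
    original = os.path.join(directory, "tool")
    hard = os.path.join(directory, "tool-hard")
    soft = os.path.join(directory, "tool-soft")
    with open(original, "w") as f:
        f.write("placeholder\n")
    os.link(original, hard)      # second directory entry for the same inode
    os.symlink(original, soft)   # separate file that only stores a path
    a, b = os.lstat(original), os.lstat(hard)
    return {
        "link_count": a.st_nlink,
        "same_inode": (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino),
        "hard_is_symlink": os.path.islink(hard),
        "soft_is_symlink": os.path.islink(soft),
    }

with tempfile.TemporaryDirectory() as tmp:
    facts = link_facts(tmp)
```

Here `facts` reports a link count of 2 and a shared inode for the hard link, while only the symlink is flagged by `os.path.islink`.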

mostthingsweb 661 days ago

Maybe to avoid broken links if you move the original files? That's the main benefit of hardlinks vs symlinks in my mind at least.
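
To illustrate the point about moving the original, here is a small sketch under the same assumptions (POSIX filesystem, hypothetical file names and helper):

```python
import os
import tempfile

def move_original(directory):
    """Rename the original file and report what each kind of link sees afterwards."""
    original = os.path.join(directory, "tool")
    hard = os.path.join(directory, "tool-hard")
    soft = os.path.join(directory, "tool-soft")
    with open(original, "w") as f:
        f.write("v1\n")
    os.link(original, hard)
    os.symlink(original, soft)
    os.rename(original, os.path.join(directory, "tool-moved"))
    with open(hard) as f:
        hard_content = f.read()  # the hard link still reaches the same inode
    return {
        "hard_content": hard_content,
        "symlink_resolves": os.path.exists(soft),        # follows the link
        "symlink_entry_remains": os.path.lexists(soft),  # does not follow it
    }

with tempfile.TemporaryDirectory() as tmp:
    after_move = move_original(tmp)
```

The symlink entry is still there but dangles, while the hard link keeps working.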

actionfromafar 661 days ago

That can also be a downside: you believe you have moved stuff, but now you can have different versions of programs that don't expect that to be a possibility.

ForOldHack 660 days ago

If there is a symlink, a hardlink, and an executable, all with the same name, which one will it run? Which one will the shell object to? Which one should the shell object to? If a virus/SUID program overwrites a symlink, no problem, but if it traces the symlink to the executable, and then overwrites that...

alerighi 660 days ago

And that makes a lot of sense, especially for binaries that are statically linked (as Rust binaries usually are), since that could save a lot of disk space!

duped 661 days ago

clang does this too.

mistercow 661 days ago

Also if you want a program to call itself, which is sometimes useful, this way lets you actually call the same program, rather than assuming the name and path.

duped 661 days ago

Don't do this - if you (reliably) want the path to the current executable there is no portable way to do it, but on Linux you need to readlink /proc/self/exe and on macOS you call _NSGetExecutablePath. I forget the API on Windows.
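
A hedged sketch of the Linux half of that advice. The `current_executable` helper is hypothetical, and in a Python script the result is the path of the interpreter binary, not of the script.

```python
import os

def current_executable():
    """Return the path of the running executable via /proc/self/exe, or None.

    Linux-specific: returns None when /proc is unavailable (not mounted,
    or a platform without it) instead of guessing from argv[0].
    """
    try:
        return os.readlink("/proc/self/exe")
    except OSError:
        return None

exe = current_executable()
```

Returning `None` leaves the caller to pick a platform-specific fallback rather than silently trusting argv[0].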

theamk 661 days ago

I would not put it in such absolute terms - /proc/self/exe has downsides as well. It resolves all symlinks, so it breaks everything that depends on argv[0]: nice help messages, Python's virtualenv, name-based dispatch, and checking whether the program was executed via a symlink or not.

A lot of the time you know you never called chdir(), in which case I'd actually recommend executing argv[0], as this is the nicest thing for admins. If you are really worried, you can use /proc/self/exe for progname and pass argv[0] as-is, but that's often overkill.

duped 661 days ago

Those are all cases where you're using argv[0] as an argument to the program where it's appropriate. Using it as the path to spawn a child process is incorrect. You're free to re-use it as an argument.

I have fixed enough software that made this mistake that I'm confident being absolute about it. It's a very easy mistake to make, but it's really annoying when software makes it and someone needs to deal with it at a higher level. It's better for developers to know that argv[0] isn't the path to the executable; it's what was used to invoke the executable.

vlovich123 660 days ago

What’s the issue with using argv[0] as a way to spawn yourself? I don’t recall running into a lot of issues.

duped 660 days ago

If it's a relative path, then changing the working directory will break it (chdir("/") is a very common tactic at the top of main()).

It's possible/desirable for the parent to change the PATH of a child process, particularly one that spawns other processes. So the argv[0] used to spawn the original process may be garbage for spawning children.

Similarly in any kind of chroot jail (which may or may not be docker these days), relative paths and PATH can be garbage even if they don't change.

The real problem is that I've seen in-house and open source frameworks/libraries that have a function like `get_executable_path` that reads `argv[0]` and this is just incorrect behavior. Spawning yourself is one of the less risky things you can do, but there are gotchas and a way to avoid them!
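
The relative-path gotcha can be reproduced in a few lines. This is only a sketch: the file name `tool` and the helper are invented, and two temporary directories stand in for the original working directory and the one the program changes into.

```python
import os
import tempfile

def exists_before_and_after_chdir(relative_path, start_dir, new_dir):
    """Check whether a relative path resolves before and after changing directory."""
    saved = os.getcwd()
    try:
        os.chdir(start_dir)
        before = os.path.exists(relative_path)
        os.chdir(new_dir)
        after = os.path.exists(relative_path)
    finally:
        os.chdir(saved)  # restore the caller's working directory
    return before, after

with tempfile.TemporaryDirectory() as start, tempfile.TemporaryDirectory() as elsewhere:
    open(os.path.join(start, "tool"), "w").close()
    # "./tool" plays the role of a relative argv[0].
    outcome = exists_before_and_after_chdir("./tool", start, elsewhere)
```

The same string that pointed at the program before the chdir no longer resolves afterwards, which is why a `get_executable_path` built on argv[0] is fragile.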

mbrumlow 661 days ago

I think you forget the exec system call’s first argument is a path to an executable, followed by an array of arguments, where argv[0] lives.

I can’t find issue with exec("/proc/self/exe", [program, …]).

alerighi 660 days ago

Well, it could be, for example, that /proc is not mounted. A lot of software breaks because of this, when there is really no need for it to. Also, that approach only works on Linux; if you want to write portable software, what do you do?

mbrumlow 660 days ago

I am mainly pointing out that argv[0] is still valid. Writing portable software is an entirely different topic.

sweetjuly 661 days ago

Note though that both of these solutions are racy and so should not be used if "someone symlinking really fast and swapping the binaries" is in your threat model. Linux's /proc/self/exe itself is safe though, just not the result from readlink.

duped 660 days ago

Well that's true, but also something that can't be addressed within a currently running process afaik.

flohofwoe 661 days ago

There's also this very handy and tiny cross-platform library:

https://github.com/gpakosz/whereami

ForOldHack 660 days ago

Four cardinal sins of programming: 1. Self-modifying code. (The word 'recalcitrant' comes to mind.) 2. Calling your own program to execute itself. 3. Interrupting the flow of control with a jump. 4. Non-graceful exit. 5. Renaming 'hack' as 'vi' or 'ps'.

SoftTalker 661 days ago

There's no guarantee that the name and the path are still the same executable that is running, or that they even exist anymore.

wang_li 661 days ago

In most of the variants of exec*() there are separate arguments for the thing to be executed and the *argv[] list. Argv[0] being the executable is just a convention. In Perl, $ARGV[0] is the first positional parameter. In

    $ perl myscript.pl a b c

$ARGV[0] is "a".

mistercow 661 days ago

I mean sure. All software is built on assumptions. Make sure the assumptions you’re making are appropriate in context.

wongarsu 660 days ago

Unless you are on Windows

glandium 660 days ago

You can actually rename an executable that is running, on Windows. That's a way to handle self updates: rename the executable, create its replacement, execute the new one to make it remove the old executable.

akira2501 661 days ago

Beware TOCTOU (time-of-check to time-of-use) problems when doing this.

fallingsquirrel 661 days ago

You can do this without assuming the name by execing /proc/$PID/exe. Then you're not vulnerable to the argv[0] spoofing described in the article. (But of course since argv[0] does exist, you should set it properly and pass through your own argv[0] unchanged.)

dpassens 661 days ago

That's not portable, though. OpenBSD, for example, doesn't have /proc.

hnlmorg 661 days ago

That’s Linux only. Wouldn’t even work on macOS, which would likely be a significant number of your users.

hi-v-rocknroll 661 days ago

coreutils-static did this too. The advantage of shared libraries and multiple-use single static binaries is they're only loaded once.

layer8 661 days ago

The article discusses this.
