Hacker News
by wietze 660 days ago
For busybox/toybox the argv[0] thing is great, and seems to be the prime example of why argv[0] shouldn't go - yet it is a bit of an anomaly in how argv[0] is used.

If there really is a need for having one executable that comprises multiple commands, is `busybox whoami` instead of `whoami` so much more effort? To me, that would make more sense in terms of what is going on; aliases could be used if one-word commands are preferred. In most non-busybox contexts, argv[0] is just an unnecessary addition that, as the linked article shows, can introduce weirdness.

It's clear from the comments there are still many who think argv[0] is a good thing, which is great - I'm glad the post sparked this debate.

7 comments

> is `busybox whoami` instead of `whoami` so much more effort?

It's not the "more effort" that is the deal breaker here. It is a matter of compliance with specs and user expectations. What you're suggesting would make Busybox very non-POSIXy, very non-Unixy. All scripts written over the last many decades would need to be updated to call `busybox ls` instead of `ls`? How is that a viable solution?

> I'm glad the post sparked this debate.

This is a very strange way to deflect concerns about the quality of the article!

Yeah. The whole point of busybox is to provide the POSIX commands in one compact executable. Making things work any other way defeats the entire purpose of busybox.
In other words: `busybox` is primarily an implementation of a _standard library_ and only secondarily a command line tool, so it _must_ use the standard names.
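
For anyone unfamiliar with the mechanism being debated, here is a minimal sketch of name-based dispatch, written as a shell script rather than a C binary. Everything in it is hypothetical (the `make_multicall` helper, the `multicall` file, and the `hello` and `bye` applet names); it only illustrates the idea of one file behaving differently depending on the name it was invoked under, with `$0` standing in for argv[0].

```shell
# Hypothetical helper: writes a tiny multi-call script into directory $1
# and creates two symlinks that point at it.
make_multicall() {
    dir=$1
    cat > "$dir/multicall" <<'EOF'
#!/bin/sh
# ${0##*/} strips the directory part, leaving the name we were invoked as.
case "${0##*/}" in
    hello) echo "hello applet" ;;
    bye)   echo "bye applet" ;;
    *)     echo "unknown applet: ${0##*/}" >&2; exit 127 ;;
esac
EOF
    chmod +x "$dir/multicall"
    # Each symlink is just another name for the same file.
    ln -s multicall "$dir/hello"
    ln -s multicall "$dir/bye"
}
```

After `d=$(mktemp -d); make_multicall "$d"`, running `"$d/hello"` prints `hello applet` and `"$d/bye"` prints `bye applet`, even though both names resolve to the same file. Because the names exist on disk, any program that executes them by path finds them, with no shell involved.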
Given that 'alias' is in POSIX, would a combination of

(a) the hypothetical non-argv0 busybox being discussed, and

(b) a POSIX shell of the maintainer's choice, with built-in aliases for 'ls=busybox ls'

be sufficient to make the system POSIX compliant?

Aliases are not inherited by subprocesses, unfortunately, so the alias solution would not work when a shell script launches other shell scripts. It wouldn't work in a wide range of other scenarios either, such as Makefiles, bespoke build tools, binary executables that do execve("/usr/bin/cmp", ...), etc.
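
The inheritance point is easy to check. A minimal sketch, assuming a POSIX-style `sh` is on the PATH; the alias name `hello_alias` and the function `child_sees_alias` are made up for the illustration:

```shell
# Define an alias in the current shell only.
alias hello_alias='echo hi'

# Ask a child shell whether it knows about the alias.
# `alias NAME` exits non-zero when NAME is not defined.
child_sees_alias() {
    if sh -c 'alias hello_alias >/dev/null 2>&1'; then
        echo yes
    else
        echo no
    fi
}
```

Calling `child_sees_alias` prints `no`: the alias lives in the parent shell's memory and is not passed through the environment to the child.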
In addition to the already-raised issue of subprocesses not inheriting aliases, I'd also be worried about aliases inherently being specific to particular shells. I'd hate to have to redefine those aliases for sh, csh, zsh, fish, and Lord knows what else. It'd also be an issue for invoking those tools without going through a shell in the first place - as is common for programs launching external programs as subprocesses.

That's indeed why I personally don't use shell aliases at all, instead opting for actual shell scripts in my $PATH. Those will work no matter what shell I'm using (if any).

`busybox whoami` is probably fine, but having to write `busybox ls`, `busybox grep`, `busybox cp` etc. would get tedious quickly.

Shell aliases don't solve all problems, even if you do:

    alias rm="busybox rm"
    alias xargs="busybox xargs"
    # etc.
you still have to write `xargs busybox rm`, because xargs won't use the shell alias.
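
A quick way to see the xargs behaviour, as a sketch under the assumption that no real command called `applet_only_alias` exists on the PATH (both that name and `xargs_finds_alias` are invented here):

```shell
# The name exists only as an alias in this shell.
alias applet_only_alias='echo ran'

# xargs executes its command directly, without going through this shell,
# so it can only find names that exist as files on the PATH.
xargs_finds_alias() {
    if printf 'x\n' | xargs applet_only_alias >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}
```

`xargs_finds_alias` prints `no`, since xargs fails to locate a program by that name and exits with an error.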

But the main problem with this approach is that POSIX and LSB require certain binaries to be available at certain paths. When they're not, most shell scripts will just break.

The minimal standard solution is probably to create shell scripts for all of these, e.g. in /bin/ls:

    #!/bin/sh
    exec /bin/busybox ls "$@"
But this both adds runtime overhead (on every invocation!) and is quite wasteful in terms of disk space. Busybox boasts over 400 tools. At 4 KiB per file, that's about 1.6 MiB of just shell scripts. Of course, that can be less if the file system uses some type of compression, which is common on embedded systems where storage space is small, but it still seems to defeat the purpose of using busybox to create a minimal system.
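
To make the wrapper approach and its space cost concrete, here is a rough sketch. The helper names `make_wrappers` and `wrapper_overhead_kib` are hypothetical, the wrappers forward their arguments with `"$@"`, and the one-block-per-file figure is an assumption about the file system rather than a measurement.

```shell
# Write one wrapper script per applet name into directory $1.
make_wrappers() {
    dir=$1
    shift
    for applet in "$@"; do
        # The single quotes keep "$@" literal so it ends up in the wrapper.
        printf '#!/bin/sh\nexec /bin/busybox %s "$@"\n' "$applet" > "$dir/$applet"
        chmod +x "$dir/$applet"
    done
}

# Worst-case space if every wrapper occupies one file system block.
# $1 = number of wrappers, $2 = block size in KiB (both are assumptions).
wrapper_overhead_kib() {
    echo $(( $1 * $2 ))
}
```

`wrapper_overhead_kib 400 4` prints `1600`, i.e. 1600 KiB or roughly 1.56 MiB, which is where the estimate of about 1.6 MiB comes from.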
Well /bin/sh is also busybox, so I think you'd need

    #!/bin/busybox sh
    exec /bin/busybox ls "$@"

?
Great point!

Actually this observation invalidates the whole setup. Because even though you could define /bin/sh itself as:

    #!/bin/busybox sh
    exec /bin/busybox sh "$@"
Then you still cannot use #!/bin/sh in any other shell scripts, because on many systems, for historical reasons, the interpreter of a script is not allowed to be another interpreted script; it must be a binary. So /bin/sh pretty much has to be an actual binary.
Yes. Anybody who has shipped software would say so.

I really don’t think it is a debate. The usage of argv[0] is massively understated by the article. Just go look at gcc or any modern-day compiler. It's used so much that the conversation about whether we should keep it has been hashed out by many different groups, yet they still chose to implement it.

The security concerns are a non-issue, as argv[0] was not the problem. The problem was a lack of technical knowledge of how systems work and a flaw in the security application.

I think you’re both forgetting that bash has been using this trick for decades.

Bash has an sh compatibility mode that runs when you invoke it as sh.
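
This can be observed directly. A small sketch, assuming bash is installed and `mktemp -d` is available; the function name `posix_mode_as` is invented for the example. It invokes bash through a symlink with a chosen name and reports the `posix` option from `set -o`.

```shell
# Run bash under the name given in $1 and print its posix option (on/off).
posix_mode_as() {
    name=$1
    tmp=$(mktemp -d) || return 1
    # Same binary, different invocation name.
    ln -s "$(command -v bash)" "$tmp/$name"
    # `set -o` lists each option followed by on or off.
    "$tmp/$name" -c 'set -o' | awk '$1 == "posix" { print $2 }'
    rm -rf "$tmp"
}
```

With a stock bash and no `POSIXLY_CORRECT` in the environment, `posix_mode_as sh` should report `on` and `posix_mode_as bash` should report `off`.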

Well, of course it's not only a matter of interactive usage (not least because the busybox shell itself could do the conversion). The problem is scripts, or worse, programs that invoke commands as subprocesses (programs whose source code you may not even have access to!).

What do you do? Replace every single occurrence of each command by prefixing it with `busybox`? Not ideal at all...

https://pubs.opengroup.org/onlinepubs/9699919799/

You appear not to realize that busybox is an essential component of a POSIX-like system.

That's fine for when users are interactively typing commands, but it doesn't work when the command is being run by a non-busybox program which expects commands to exist in the standard locations.