by steamrolled 369 days ago
The main reason system() exists is that people want to execute shell commands; some confused novice developers might mix it up with execl(), but this is not a major source of vulnerabilities. The major source of vulnerabilities is "oh yeah, I actually meant to execute shell".

So if you just take away the libcall, people will make their own version by just doing execl() of /bin/sh. If you want this to change, I think you have to ask why people want to do this in the first place.

And the answer here is basically that because of the unix design philosophy, the shell is immensely useful. There are all these cool, small utilities and tricks you can use in lieu of writing a lot of extra code. On Windows, command-line conventions, filesystem quirks, and escaping gotchas are actually more numerous. It's just that there's almost nothing to call, so you get fewer bugs.

The most practical way to make this class of bugs go away is to make the unix shell less useful.


Most calls to system() that I've seen could be replaced with exec without much difficulty. There are relatively few that actually need the shell functionality.
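To make the difference concrete, here's a rough Python sketch (Python only because it's short; the function names and the hostile string are made up for illustration). The first function builds a command line and hands it to /bin/sh, which is what system() does; the second passes an argument vector straight to exec.

```python
import subprocess

def echo_via_shell(arg):
    # system()-style: the string is parsed by /bin/sh, so metacharacters in arg are live.
    return subprocess.run("echo " + arg, shell=True, stdout=subprocess.PIPE).stdout

def echo_direct(arg):
    # exec-style: arg reaches echo as a single argv entry; no shell ever parses it.
    return subprocess.run(["echo", arg], stdout=subprocess.PIPE).stdout

hostile = "hi; echo injected"  # hypothetical attacker-controlled input
```

With the shell in the loop the input turns into a second command; without it, the same bytes are just printed.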
system() involves fork()ing, setting up signal handlers, exec()ing and wait()ing. You won't be replacing it with exec; most of the time you'll be reimplementing it for absolutely no reason.
Python has os.spawnl, os.spawnv, etc., which fork()s, wait4()s, etc., without involving a shell. This is much better; this is the library function you should be using instead of system() most of the time. Unfortunately I don't think glibc has an equivalent!

    strace -o tmp.spawnlp -ff python3 -c 'import os; os.spawnlp(os.P_WAIT, "true", "true")' 
In parent:

    clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdc03233310) = 225954
    wait4(225954, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 225954
    --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=225954, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
In child:

    set_robust_list(0x7fdc03233320, 24)     = 0
    gettid()                                = 225954
    clock_gettime(CLOCK_MONOTONIC, {tv_sec=2458614, tv_nsec=322829153}) = 0
    clock_gettime(CLOCK_MONOTONIC, {tv_sec=2458614, tv_nsec=323030718}) = 0
    execve("/usr/local/bin/true", ["true"], 0x7ffdc5008458 /* 44 vars */) = -1 ENOENT (No such file or directory)
    execve("/usr/bin/true", ["true"], 0x7ffdc5008458 /* 44 vars */) = 0
Here, I think strace shows clone() rather than fork() because glibc's fork() is a library function that invokes clone(), rather than a real system call.
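For comparison, the same clone/execve/wait4 sequence can be written out by hand. This is only a sketch of the idea, not what os.spawnlp literally does internally; spawn_wait is a made-up name, and it skips the signal handling that system() does.

```python
import os

def spawn_wait(argv):
    """Run argv without a shell and return its exit status (negative if killed by a signal)."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; execvp searches PATH like the trace above.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed (e.g. command not found); 127 mirrors the shell convention.
            os._exit(127)
    # Parent: block until the child exits.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Arguments are passed as a list, so spaces and semicolons in them stay inside a single argv entry.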
> Python has os.spawnl, os.spawnv, etc., which fork()s, wait4()s, etc., without involving a shell.

Good. How do you pipeline commands with these?

These functions can't do it. In Python you have to use the subprocess module if you want to pipeline commands without the bugs introduced by the shell. From https://docs.python.org/3.7/library/subprocess.html#replacin...:

    p1 = Popen(["dmesg"], stdout=PIPE)
    p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
    output = p2.communicate()[0]
Of course, now, nobody has an hda, and dmesg is root-only. A more modern example is in http://canonical.org/~kragen/sw/dev3/whereroot.py:

    p1 = subprocess.Popen(["df"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", "/$"], stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()
    return p2.communicate()[0]
Note that the result here is a byte string, so if you want to print it out safely without the shell-like bugginess induced by Python's default character handling (what happens if the device name isn't valid UTF-8?), you have to do backflips with sys.stdout.buffer or UTF-8B.
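A minimal sketch of both backflips, assuming a made-up df-like line containing a byte that isn't valid UTF-8. The helper names are mine; the surrogateescape error handler is Python's built-in version of the UTF-8B trick.

```python
import sys

def print_raw(data: bytes) -> None:
    """Write bytes to stdout unchanged, bypassing Python's text encoding layer."""
    sys.stdout.flush()             # keep ordering sane relative to earlier print() calls
    sys.stdout.buffer.write(data)
    sys.stdout.buffer.flush()

def to_text(data: bytes) -> str:
    """Decode as UTF-8, smuggling invalid bytes through as lone surrogates."""
    return data.decode("utf-8", "surrogateescape")

def to_bytes(text: str) -> bytes:
    """Inverse of to_text: restores the original bytes exactly."""
    return text.encode("utf-8", "surrogateescape")

raw = b"/dev/caf\xe9  42% /\n"     # hypothetical output line; \xe9 alone is invalid UTF-8
```

print_raw(raw) emits the bytes exactly as df produced them; to_text(raw) gives a str you can slice and split, as long as you convert back with to_bytes before writing it anywhere.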

Python got a lot of things wrong, and it gets worse all the time, but for now spawning subprocesses is one of the things it got right. Although, unlike IIRC Tcl, it doesn't raise an exception by default if one of the commands fails.
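If you do want a failure to raise, you have to check the return codes yourself. Something like this sketch would do it; pipeline_checked is a hypothetical helper, not part of the subprocess module.

```python
import subprocess

def pipeline_checked(argv1, argv2):
    """Run `argv1 | argv2` without a shell; raise if either command exits nonzero."""
    p1 = subprocess.Popen(argv1, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(argv2, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()              # allow p1 to receive a SIGPIPE if p2 exits
    output = p2.communicate()[0]
    p1.wait()                      # communicate() only reaped p2
    for p in (p1, p2):
        if p.returncode != 0:
            raise subprocess.CalledProcessError(p.returncode, p.args, output)
    return output
```

Two caveats: grep exits 1 when nothing matches, and a first command killed by SIGPIPE shows up with a negative return code, so a strict check like this will raise in both cases.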

Apart from the semantics of the operations, you could of course desire a better notation for them. In Python you could maybe achieve something like

    (cmd(["df"]) | ["grep", "/$"]).output()
but that is secondary to being able to safely handle arguments containing spaces and pipes and whatnot.
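For what it's worth, that notation is only a few lines on top of Popen. This is a sketch of one way to do it, not an existing library; the cmd class and its output() method are invented here to match the notation above.

```python
import subprocess

class cmd:
    """Hypothetical pipeline builder: cmd(argv) | argv | ... Nothing runs until output()."""

    def __init__(self, argv):
        self.stages = [list(argv)]

    def __or__(self, other):
        # Accept either another cmd or a plain argv list on the right-hand side.
        more = other.stages if isinstance(other, cmd) else [list(other)]
        result = cmd(self.stages[0])
        result.stages = self.stages + more
        return result

    def output(self):
        """Start every stage, connect them with pipes, return the last stage's stdout."""
        procs = []
        prev = None                # stdout of the previous stage, if any
        for argv in self.stages:
            p = subprocess.Popen(argv, stdin=prev, stdout=subprocess.PIPE)
            if prev is not None:
                prev.close()       # drop our copy so upstream can get SIGPIPE
            prev = p.stdout
            procs.append(p)
        out = procs[-1].communicate()[0]
        for p in procs[:-1]:
            p.wait()               # reap the upstream stages
        return out
```

Every stage is still an argument list, so `(cmd(["echo", "a b | c"]) | ["cat"]).output()` passes the pipe character through as data rather than treating it as syntax.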
Dunno, so much work to achieve so little. I'm even more inclined to stick with shell scripts now
There is posix_spawn(). Some operating systems even implement that as a system call (not Linux). Implementing that as a system call has the advantage that spawning a new process from a process that has a huge memory mapping is fast, because the memory mappings don't need to be copied (yes, I know the memory is copy on write, but the mappings themselves have to be correctly copied with the information needed for copy on write).
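Python exposes it too, as os.posix_spawn and os.posix_spawnp (3.8 and later). A minimal sketch of a system()-like wrapper on top of it, with no shell involved; the name spawn is just for illustration.

```python
import os

def spawn(argv):
    """Start argv via posix_spawnp (PATH lookup, no shell) and wait for it to finish."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)   # negative value: killed by that signal
```

How the child actually gets created underneath is up to the C library, so whether you get the cheap path depends on the platform.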