by LukeShu 3034 days ago
Not to be negative, but for learning: There are a few "problems" with this snippet of the article:

----

    $ which cd
    /usr/bin/cd
    $ cat /usr/bin/cd
    #!/bin/sh
    # $FreeBSD: src/usr.bin/alias/generic.sh,v 1.2 2005/10/24 22:32:19 cperciva Exp $
    # This file is in the public domain.
    builtin `echo ${0##*/} | tr \[:upper:] \[:lower:]` ${1+"$@"}
Oh, bother! Reading shell scripts can be such a hassle sometimes. I know the tr command is used to translate characters. In this particular case, the second half of the command, the part after the pipe symbol, basically converts the command cd dev to CD dev. I have no idea why this is. In any case, this modified command is passed to the builtin command which is handled by the shell (Bourne shell) that we are using.

----

1. The first mistake is typing `which cd`. `which` is a separate program that looks things up in $PATH, which may not actually be what happens when you run the command. You should have used `type cd`:

    $ type cd
    cd is a shell builtin
As you discover later in the article, `cd` must be a shell builtin. Which makes it a little mysterious (and interesting!) why the file /usr/bin/cd exists; it won't really do anything, try it:

    $ pwd
    /home/lukeshu
    $ /usr/bin/cd /usr
    $ pwd
    /home/lukeshu
    $ # but it will print error messages
    $ /usr/bin/cd /bogus
    /usr/bin/cd: line 4: cd: /bogus: No such file or directory
So, why does /usr/bin/cd exist? The comment with the CVS ID gives us a hint: it's a common "src/usr.bin/alias/generic.sh" that is copied (hard-linked) into /usr/bin for several shell builtins ( https://github.com/freebsd/freebsd/blob/0bc1bed704cc7b7292be... ). For builtins that don't strictly need to be builtins, that makes sense: it lets other programs call them with exec. For `cd` it doesn't make much sense, though, and I'm not sure why it exists. Is it just for consistency with other builtins, or does it serve a real purpose? IDK.

(edit: the short answer is "POSIX says so" https://github.com/freebsd/freebsd/commit/55d0b8395514ae4055... , but why does POSIX say so? See my child comment for further citation.)
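To make the "it won't really do anything" point concrete: the working directory belongs to each process, so a `cd` that runs in a child process can't move the shell that launched it. Here's a minimal sketch; I'm using `sh -c` as a stand-in for /usr/bin/cd (that substitution and the variable names are mine, not from the article):

```shell
#!/bin/sh
# Sketch: a directory change made in a child process stays in the child.
# `sh -c` is a stand-in for running an external program like /usr/bin/cd.

before=$(pwd)

# The child changes to / and reports where it ended up.
child_dir=$(sh -c 'cd / && pwd')

after=$(pwd)

echo "child ended up in: $child_dir"
echo "parent before:     $before"
echo "parent after:      $after"
```

The child reports `/`, while the parent's "before" and "after" lines are identical.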

2. The second mistake is about what `tr` is doing. You claimed it's converting lowercase to uppercase; but that's backward, it's converting uppercase to lowercase.

So, why does it convert to lowercase? Recall that we learned that it's the same script being used for all builtins. If it weren't literally the same file (at the cost of a few more bytes of disk space), it could have just done a search/replace within a template, having each be `builtin BUILTIN_NAME ${1+"$@"}`. But they wanted to save a few bytes, so instead the script must work out the appropriate builtin name from the path it was invoked as. If you execvp("cd", ...), it will invoke the script with $0 set to "/usr/bin/cd". If /usr/bin is on a case-insensitive filesystem, and you execvp("CD", ...), that will also call the script, with $0 set to "/usr/bin/CD". How is it going to translate from "/usr/bin/CD" to "cd"? The ##*/ bit trims the leading directories, then the tr bit converts the remainder to lower case.
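If you want to poke at that path-to-name translation in isolation, here's a small sketch. The function name `builtin_name` is hypothetical (the real script does this inline with backticks and `$0`), and I've quoted the `tr` character classes instead of backslash-escaping them:

```shell
#!/bin/sh
# Sketch of the two steps the script applies to $0.
# builtin_name is a made-up helper for illustration only.
builtin_name() {
    path=$1
    base=${path##*/}    # drop the longest prefix ending in "/", i.e. the directories
    printf '%s\n' "$base" | tr '[:upper:]' '[:lower:]'    # fold what is left to lower case
}

builtin_name /usr/bin/CD    # prints: cd
builtin_name /usr/bin/cd    # prints: cd
```

If the path has no slash in it at all, the `##*/` pattern simply doesn't match and the name passes through to `tr` unchanged.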

(as an aside: the `${1+"$@"}` is a little interesting too; why not just write `"$@"`? "$@" will expand to the full list of arguments (after argv[0]). The ${1+...} bit says to only do that expansion if the first argument exists (i.e., there are >= 1 arguments). But that should basically be happening anyway; if there are no arguments, "$@" should expand to a zero-length list. IDK, perhaps a weird historical shell?)
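Here's a quick sketch for checking that the two spellings agree in whatever shell you have handy. The function names are made up for illustration; on a POSIX-conforming sh I'd expect the two forms to print the same counts, which is exactly why the idiom looks redundant there:

```shell
#!/bin/sh
# Sketch: compare "$@" with ${1+"$@"} by counting the arguments each passes on.
# count_args, plain and guarded are hypothetical names.
count_args() { echo "$#"; }

plain()   { count_args "$@"; }
guarded() { count_args ${1+"$@"}; }

plain                # no arguments
guarded              # no arguments
plain   "a b" ""     # two arguments, one with a space, one empty
guarded "a b" ""
```

On a POSIX-conforming shell this prints 0, 0, 2, 2: the guarded form still keeps the embedded space and the empty argument intact.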

3 comments

https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/a...

One of the most famous shell-portability issues is related to "$@". When there are no positional arguments, Posix says that "$@" is supposed to be equivalent to nothing, but the original Unix version 7 Bourne shell treated it as equivalent to "" instead, and this behavior survives in later implementations like Digital Unix 5.0.

The traditional way to work around this portability problem is to use ${1+"$@"}.

Thanks for digging that up! But for scripts written for FreeBSD in 2002 [1], I have to wonder which non-POSIX-y shells they were worried about.

[1]: https://github.com/freebsd/freebsd/commit/55d0b8395514ae4055...

Also some shells will refuse to run a builtin if there is no executable in the path that matches it.

POSIX actually mandates this behavior for any builtin not on a specific list, though most shells (even dash, which is typically obsessive about complying with POSIX) do not implement this behavior.

That didn't sound right to me, but I looked it up, and you're right.

The scripts we're discussing were written in 2002 [1], so the time-appropriate version is POSIX-2001 (Issue 6).

[1]: https://github.com/freebsd/freebsd/commit/55d0b8395514ae4055...

But you don't have to take my word for it: I've paste-bin'ed the entirety of what POSIX-2001 had to say about shell built-ins (in general, I didn't include man-pages for individual built-ins): https://lukeshu.com/dump/posix-2001-builtins.txt

As for whether that's still true today, looking at POSIX-2008 (Issue 7), 2013 edition (I don't have a copy of the 2016 edition handy), none of that has changed.

I only discovered this because I implemented a shell strictly from the specification (just for didactic purposes). I was unable to find a modern shell that acted this way, even when passing the "be more POSIXy" options.

> If /usr/bin is on a case-insensitive filesystem, and you execvp("CD", ...)

That seems like a fairly rare edge case, considering that UNIX typically had case-sensitive file systems, or rather handled filenames as opaque blobs. I wonder what actual system caused the need for case folding? Maybe HFS?

The `tr` bit isn't actually in the FreeBSD version of the script[1], so I'm assuming it's a macOS HFS thing.

[1]: https://github.com/freebsd/freebsd/blob/master/usr.bin/alias...

Mac filesystems case-fold by default, so the script could be invoked as cd, CD, Cd, or cD. Only the lowercase cd is shadowed by the builtin.