The Unix `Who` Command (gauthier.uk)
96 points by warenhor 2119 days ago
11 comments

It's funny to me, because the way I came up in Unix, `utmp` was one of the first things you learned about: you were using Unix not because you installed it but because every system you might want to break into (for instance, to get to IRC through an outdial or a gateway) was running it, and they were all multi-user shell machines, and you wanted to make sure you weren't showing up in `who`. I probably knew the `utmp` format before I knew how to use `find`.

You'll find, in archives from hacking sites of the era, a whole variety of "utmp editors".

Thanks to setlogin()/getlogin() some operating systems are frustratingly close to the point where all of the relevant information can be read straight out of the kernel's process table, without need for coöperatively maintaining a data file.

On FreeBSD, for example, one can do everything except filter out the terminals where no-one is logged on yet and print a "FROM" column.

Understanding UNIX/LINUX Programming: A Guide to Theory and Practice by Bruce Molay is a work which has the reader learn system programming by re-implementing UNIX commands.

who is the focus of Chapter 2.

How useful is that book nowadays? Is it still worth purchasing?
Yeah, I'm wondering if this would be interesting to do as a recreational project. It's copyright 2003 and mostly unavailable; Amazon dares to "rent" it to me for $45... give me a break.
I was messing with it a bit several months ago. You'd have to look up newer data structures, but just looking up the man pages gave me the necessary information. While the implementation may be different now, the fundamentals of UNIX are pretty much the same.
I was confused - is "who mom hates" supposed to do something interesting?

On my Mac it seems any two arguments are allowed as an alias for "am i", so long as the first doesn't start with "-".

Looking it up, https://unix.stackexchange.com/questions/108145/is-who-mom-l... suggests it's a well-known in-joke.

FWIW, on my FreeBSD machine, only "who am i"/"who am I" are allowed.

Any two non-option arguments return the user of the invoking tty. "who is awesome" or "who foo bar" achieves the same result.
I was confused because when I'm in tmux `who am i` (or `who mom hates` or `who -m`) prints nothing. I thought that might've been the joke -- I'm nobody (or something). When I detach from tmux and run it in the login shell directly it works as expected. The phrasing in the man page ("'am i' or 'mom likes' are usual.") is also pretty strange.
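To make the "any two arguments" behaviour concrete, here is a rough C sketch of how an "am i" mode might be wired up. It is not taken from any real `who` source: the helper names `wants_own_line`, `tty_to_line` and `entry_is_on_line` are hypothetical, and it assumes a POSIX `<utmpx.h>` with `ut_type` and `ut_line` fields. The idea is that two operands switch on a "my line only" mode, and then only the record whose `ut_line` matches the terminal on stdin is printed. If no record exists for that terminal, nothing is printed, which may be what happens inside tmux (an assumption, not something confirmed in this thread).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <utmpx.h>

/* Hypothetical helper: true when exactly two non-option operands were
 * given ("am i", "mom hates", "foo bar"). Assumes options were already
 * stripped by the caller. */
bool wants_own_line(int operand_count)
{
    return operand_count == 2;
}

/* Hypothetical helper: turn "/dev/pts/3" into "pts/3" so it can be
 * compared with ut_line, which is stored without the /dev/ prefix. */
const char *tty_to_line(const char *tty_path)
{
    static const char prefix[] = "/dev/";

    if (tty_path != NULL && strncmp(tty_path, prefix, sizeof prefix - 1) == 0)
        return tty_path + (sizeof prefix - 1);
    return tty_path;
}

/* Hypothetical helper: does this record describe a login on `line`?
 * strncmp is bounded by the field size because ut_line is not
 * guaranteed to be NUL-terminated. */
bool entry_is_on_line(const struct utmpx *u, const char *line)
{
    return line != NULL
        && u->ut_type == USER_PROCESS
        && strncmp(u->ut_line, line, sizeof u->ut_line) == 0;
}
```

A caller would typically pass `tty_to_line(ttyname(STDIN_FILENO))` as `line` and walk the records with `getutxent()`. When `ttyname()` returns NULL or no record matches, the loop simply prints nothing.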
I love your simple demonstration of strace reverse engineering a UNIX command. Probably a very silly question but I get this compile error? mywho.c:8:13: error: use of undeclared identifier 'LC_ALL' setlocale(LC_ALL, ""); I should solve it myself but since you are here?
Not the author, but I had the same issue. Add `#include <locale.h>` after the other headers.
Thanks for the help! It works!
I was hoping to see why who was better than w or similar; instead I was pleasantly surprised to find someone taking a really simple unix command, reverse engineering it, and reimplementing it.
The comment "initialize the utmp structure" is wrong. You have declared the struct here but it is uninitialized.

And there is no point in setting the locale, since your program never uses it.
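For anyone unsure about the declared-versus-initialized distinction, a minimal sketch follows. It uses `struct utmpx` from POSIX `<utmpx.h>` as an assumption (the same applies to `struct utmp`), and `make_empty_entry` is a hypothetical name, not something from the blog post.

```c
#include <string.h>
#include <utmpx.h>

/* Hypothetical helper showing the difference between declaring a
 * record and actually initializing it. */
struct utmpx make_empty_entry(void)
{
    struct utmpx entry;               /* declared only: contents are indeterminate */

    memset(&entry, 0, sizeof entry);  /* this line is the initialization */
    return entry;
}
```

In a program that immediately overwrites the struct with data read from the utmp file, the missing initialization is harmless, so the fix there is arguably just to reword the comment.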

I was legitimately confused for a second when I saw `Who` capitalized, but it must have been some automatic HN formatting - it's lower-case in the linked blog post.
who is 66 % waste: w is 300 % shorter and shows up to 721 % more valuable information.
who shows where the session is coming from which is useful for me where random people connect with the same system account.

(OK w -f shows this)

There's something delightfully absurd about invoking who and getting told where they're logging in from. There's an Abbott and Costello skit in there waiting to come out.

This is why I love computing. It's putting abject insanity to good use.

Nice article. Reading the `who(1)` manpage would also have revealed `utmp` to be the relevant file, of course.
I’ve found grep to be unreliable or behave unexpectedly with strace. Even when piping to strings first, although not sure if that is of any consequence. Anyone know why that is or experienced the same?
You know it outputs to stderr, right? Also if you give it the option "-o -" it doesn't do what you might expect. Other than that I grep the output of strace all the time and it works for me.
His "who" implementation is 73 lines (21 if you remove blank lines and comments). Compare that to 836 in GNU Coreutils[0].

[0] https://github.com/coreutils/coreutils/blob/master/src/who.c

plan 9 who is a 3 line shell script that just greps the output of ps and sorts it. You could eliminate ps and just grep through /proc but why reinvent the wheel when an existing tool already does part of the job?

I see some comments here citing a lack of options, most of which appear to have nothing to do with who is logged into the machine.

Source for the mentioned script: https://github.com/0intro/plan9/blob/master/rc/bin/who. Also, other utilities implemented as scripts are fshalt (P9's shutdown), kill, lookman (P9's apropos), uptime and whois.
Does anybody know why there are line breaks in the sed command?
It runs multiple expressions. That is

  sed '/pattern1/d
  /pattern2/d'
is the same as

  sed '/pattern1/d; /pattern2/d'
and

  sed -e '/pattern1/d' -e '/pattern2/d'
I believe those are portable across all `sed` implementations, and they at least work with GNU `sed --posix` and plan9port's sed.
Thank you! This is exactly the kind of reply I was hoping for.
Is /proc as portable as `ps`?
There's a long explanation when it comes to ps, but the very short answer is twofold:

* Don't parse the output of ps, unless it is my ps command (or plan 9's one, as here). See https://unix.stackexchange.com/a/593198/5132 and https://unix.stackexchange.com/a/578816/5132 , and their further reading, especially Greg Wooledge's article.

* /proc is as portable as ps, but that doesn't really amount to much because neither is really portable at all. Pretty much every operating system's /proc is different; and no implementation of the ps command fully conforms to even the limited subset laid out in the Single Unix Specification.

I guess that Linux `/proc` is different from Plan 9 `/proc`, and Linux `ps` is different from Plan 9 `ps`.

So maybe yes.

OpenBSD's `who` is in the 300-line range:

http://cvsweb.openbsd.org/src/usr.bin/who/who.c?rev=1.29&con...

this version appears to derive from rewriting unix who in the 4.4BSD era to replace AT&T code (copyright 1989 / no AT&T in header notes as basis for assumption)

It's interesting. FreeBSD's is about the same size, but was again rewritten in 2002 "to add some features required by SUSv3": https://github.com/freebsd/freebsd/commit/1894db5ac7af64acc7...
V7 `who` was 62 lines; by V10 it was bloated up to 93.
> Of course, this only mocks the most basic features of the who command and doesn’t handle any option, like the famous `who am i` or `who mom hates`.

it's not quite an apples to apples comparison as his rewrite doesn't handle any options or cleverness, but it's still a massive jump in LOC.

I know that it isn't a full implementation and my comment wasn't meant as a real comparison.
Nor is it anywhere near as portable. Though that's not a big issue these days.
Wow, I touched a nerve I don't understand, to get a -1, so I'll explain myself. The GNU code looks like it's portable to a bunch of OSes, some with and some without a given feature. That sort of portability leads to having a lot of #ifdefs. If you're only going to support a handful of OSes - because that's all most people care about - then of course your code is going to be smaller.

As a concrete example, the linked-to "who" uses:

    if (entry.ut_type != USER_PROCESS)
      continue;
The GNU one uses:

      if (IS_USER_PROCESS (utmp_buf))
The IS_USER_PROCESS() macro is in gnulib's readutmp.h (which would need to be included in the overall line count):

   # define IS_USER_PROCESS(U)                                     \
      (UT_USER (U)[0]                                              \
       && (UT_TYPE_USER_PROCESS (U)                                \
           || (UT_TYPE_NOT_DEFINED && UT_TIME_MEMBER (U) != 0)))
The UT_USER maps to ut_user on some systems, and ut_name on others:

   /* Accessor macro for the member named ut_user or ut_name.  */
   # if HAVE_UTMPX_H
   
   #  if HAVE_STRUCT_UTMPX_UT_USER
   #   define UT_USER(Utmp) ((Utmp)->ut_user)
   #  endif
   #  if HAVE_STRUCT_UTMPX_UT_NAME
   #   undef UT_USER
   #   define UT_USER(Utmp) ((Utmp)->ut_name)
   #  endif
This is because, as https://www.gnu.org/software/libc/manual/html_node/Logging-I... points out, "Note that the ut_user member of struct utmp is called ut_name in BSD. Therefore, ut_name is defined as an alias for ut_user in utmp.h"

Each of those other macros supports cases where the machine supports a given test, and when it doesn't. Including for machines which don't implement ut_type! (That's what the UT_TYPE_NOT_DEFINED case is for, which falls back to using the time field ... which has its own set of variants!)
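For contrast, here is roughly what the same check collapses to when you only target systems with a POSIX `<utmpx.h>` that has both `ut_user` and `ut_type` (that is my assumption here, and it is exactly the one gnulib refuses to make). The names `is_user_process` and `count_user_sessions` are made up for this sketch and do not come from coreutils or the linked post.

```c
#include <stdbool.h>
#include <stddef.h>
#include <utmpx.h>

/* Simplified, non-portable counterpart of the IS_USER_PROCESS macro:
 * a record counts as a login when it has a non-empty user name and
 * its type is USER_PROCESS. There is no fallback for systems that
 * lack ut_type or that call the name field ut_name. */
bool is_user_process(const struct utmpx *u)
{
    return u->ut_user[0] != '\0' && u->ut_type == USER_PROCESS;
}

/* Walk the utmpx database and count the records that pass the check. */
int count_user_sessions(void)
{
    int count = 0;
    struct utmpx *u;

    setutxent();                      /* rewind to the first record */
    while ((u = getutxent()) != NULL) {
        if (is_user_process(u))
            count++;
    }
    endutxent();                      /* close the database */
    return count;
}
```

That is a handful of lines with no #ifdefs, which is the whole point: the line-count gap is largely the price of supporting the variants this version ignores.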

Portability is neither options nor cleverness, so I added it to the list of considerations. But then again, most people now don't need to support decades of history and scores of Unix variants.

If you don't care about all the features then Busybox's implementation will work fine at 170 lines inclusive of comments.