| First of all, thank you for presenting a succinct take on this viewpoint from the other side of the fence from where I am at. So how can I learn from this? (Asking very aggressively, especially for Internet writing, to make the contrast unmistakable. And contrast helps with perceiving differences and mistakes.) (You also don’t owe me any of your time or mental bandwidth, whatsoever.) So here goes: Question 1: How come "speed", "performance", race conditions and st_ino keep getting brought up? Speed (latency), physically writing things out to storage (sequentially, atomically (ACID), all of HDD NVME SSD ODD FDD tape, "haskell monad", event horizons, finite speed of light and information, whatever) as well as race conditions all seem to boil down to the same thing. For reliable systems like accounting the path seems to be ACID or the highway. And "unreliable" systems forget fast enough that computers don’t seem to really make a difference there. Question 2: Does throughput really matter more than latency in everyday application? Question 3 (explanation first, this time): The focus on inode numbers is at least understandable with regards to the history of C and unix-like operating systems and GNU coreutils. What about this basic example? Just make a USB thumb drive "work" for storing files (ignoring nand flash decay and USB). Without getting tripped up in libc IO buffering, fflush, kernel buffering (Hurd if you prefer it over Linux or FreeBSD), more than one application running on a multi-core and/or time-sliced system (to really weed out single-core CPUs running only a single user-land binary with blocking IO). |
Here's a related example of what happens when you change a shell primitive's behavior - even interactively. Back in the 2000s, Linux distributions started adding color output to the ls command via a default "alias ls=/bin/ls --color=auto". You know: make directories blue, symlinks cyan, executables purple; that kind of thing. Somebody thought it would be a nice user experience upgrade.
I was working at a NAS (NFS remote box) vendor in tech support. We frequently got calls from folks who had just switched to Linux from Solaris, or had just moved their home directories from local disk to NFS. They would complain that listing a directory with a lot of files would hang. If it came back at all, it would be in minutes or hours! The fix? "unalias ls". Because calling "/bin/ls" would execute a single READDIR (the NFS RPC), which was 1 round-trip to the server and only a few network packets; but calling "/bin/ls --color=auto" would add a STAT call for every single file in the directory to figure out what color it should be - sequentially, one-by-one, confirming the success of each before the next iteration. If you had 30,000 files with a round-trip time of 1ms that's 30 seconds. If you had millions...well, either you waited for hours or you power-cycled the box. (This was eventually fixed with NFSv3's READDIRPLUS.)
Now I'm sure whomever changed that alias did not intend it, but they caused thousands of people thousands of hours of lost productivity. I was just one guy in one org's tech support group, and I saw at least a dozen such cases, not all of which were lucky enough to land in the queue of somebody who'd already seen the problem.
So I really appreciate GNU coreutils' commitment to sane behavior even at the edges. If you do systems work long enough, you will ride those edges, and a tool which stays steady in your hand - or script - is invaluable.