Hacker News
by forrestthewoods 698 days ago
Hrmmm. But why?

Quite frankly I think Bash scripting is awful and frequently wish shell scripts were written in a real and debuggable language. For anything non-trivial that is.

I feel like I’d rather write C and compile it with Cosmopolitan C to give me a cross-platform binary than this.

Neat project. Definitely clever. But it’s headed in the opposite direction from what I’d prefer...

4 comments

Master Foo once said to a visiting programmer: “There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

The programmer, who was very proud of his mastery of C, said: “How can this be? C is the language in which the very kernel of Unix is implemented!”

Master Foo replied: “That is so. Nevertheless, there is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

The programmer grew distressed. “But through the C language we experience the enlightenment of the Patriarch Ritchie! We become as one with the operating system and the machine, reaping matchless performance!”

Master Foo replied: “All that you say is true. But there is still more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

The programmer scoffed at Master Foo and rose to depart. But Master Foo nodded to his student Nubi, who wrote a line of shell script on a nearby whiteboard, and said: “Master programmer, consider this pipeline. Implemented in pure C, would it not span ten thousand lines?”

The programmer muttered through his beard, contemplating what Nubi had written. Finally he agreed that it was so.

“And how many hours would you require to implement and debug that C program?” asked Nubi.

“Many,” admitted the visiting programmer. “But only a fool would spend the time to do that when so many more worthy tasks await him.”

“And who better understands the Unix-nature?” Master Foo asked. “Is it he who writes the ten thousand lines, or he who, perceiving the emptiness of the task, gains merit by not coding?”

Upon hearing this, the programmer was enlightened.

This koan shows the power of a one-liner, not shell scripting in general. Both Master Foo and Nubi would agree that a string/array manipulating function in bash isn’t worth their time when python exists.
I was going to cite this as soon as I read the parent comment. Was very glad to see you beat me to it!
And then the programmer had to debug a hundred line shell script and they realized it should have all been written in Python or Rust instead.

Master Foo is shorthand for Fool.

Shell is just one way. There’s nothing that says we can’t do better than shell, but what it’s good at is saving programmer time when the need isn’t there for more, and Rust is definitely not good at that.
My rule of thumb:

    Shell: <= 5 lines
    Python: <= 500 lines
    Rust: > 500 lines
Although to be honest I'd be perfectly happy if Shell was restricted to single line commands only.

I've wasted a lot of time and energy deciphering undebuggable shell scripts that were written to "save programmer time". Not a fan.

My rule (and the code review policy I impose) emphasizes complexity instead - a 50 line shell script is great if it doesn't use if or case. (It's not so much of a strict rule as "once you're nesting conditionals, or using any shell construct that really needs a comment to explain the shell and not your code, you should probably already have switched to python." This is in parallel with "error handling in this case is critical, do you really think your bash is accurate enough?")

I wasn't the strictest reviewer (most feared, sure, but not strictest) at least partly because my personal line for "oh that bit of shell is obvious" is way too high.
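A rough sketch of how the "no if or case" review rule above could be checked mechanically. This is only an illustration, not the reviewer's actual tooling: the function name `branching_lines` is made up, and the check is a plain line-prefix heuristic that will miss conditionals written after `;` or `&&` and can be fooled by heredocs.

```python
import re

# Keywords treated as branching constructs. Only `if` and `case` are named
# in the comment above; matching them at the start of a line is an assumption.
BRANCH_KEYWORDS = re.compile(r"^\s*(if|case)\b")


def branching_lines(script_text: str) -> list[int]:
    """Return 1-based line numbers that begin an `if` or `case` construct."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # ignore comment lines
        if BRANCH_KEYWORDS.match(line):
            hits.append(number)
    return hits
```

A script for which this returns an empty list would pass the "straight-line shell only" bar; anything else is a prompt for the reviewer to ask whether it should be Python instead.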

Nothing is as obvious as it could be when it’s 3am and you’re debugging a production outage. :)
This rule of thumb is clearly too simplified, even as far as the definition goes.

Sometimes you just want to execute 50 lines with little logic.

Sometimes you just have some simple logic that needs to be repeated.

Sometimes that logic is complicated, sometimes it is not.

Sometimes someone writes 50 lines of simple logic. And then sometimes someone else needs to figure out why it’s not working. That person gets very cranky and wastes a lot of time when those “simple” 50 lines aren’t debuggable.

If shell scripting didn’t exist I would be totally fine with that. There are far more scripts that I wish were written in a real language than the other way around.

Master Foo long predates Python and Rust.
Masters live to be surpassed by their students. Just because something was best in class in the 80s doesn't mean it should still be used.
Very true, but also student hubris is legendary. Which is perfectly fine, as we all know successful students.

But let's not blind ourselves with survivorship bias. Not everything new and very bright will stand the test of time.

So let's take everything with a grain of salt and wait until time has chosen its champions, which might not be the best technology, as we have learned.

I don't know about the specific motivations for this project, but if you're curious about why work like this might have serious real-world relevance beyond scratching an itch, idle exploration, or meeting a research paper quota, you can look to similar work and literature:

GNU Mes: https://www.gnu.org/software/mes/

Stage0: https://bootstrapping.miraheze.org/wiki/Stage0

Ribbit (same authors): https://github.com/udem-dlteam/ribbit

stage0-posix: https://github.com/oriansj/stage0-posix

Bootstrappable Builds: https://bootstrappable.org/

See also this LWN article about bootstrappable and reproducible builds: https://lwn.net/Articles/841797/ It contains a plethora of interesting links.

I'm not the OP, but I think the goal is to make it cross architecture. Cross platform C compiler would give you cross OS compatibility, but chip architecture would still be fixed, I think.

I.e., you can take your compiled.sh and run it on an obscure processor with an obscure OS; as long as it's POSIX, it should work...

I believe the goal is to defeat the compiler trust thought exercise where a malicious compiler could replicate itself when being asked to compile the compiler. Since this produces human readable code instead of assembly, the idea is it allows bootstrapping a trusted compiler, since pnut.sh and any output shell executables are directly auditable.

I suppose the trust moves to the shell executable then, but at least you could run the bootstrapping with multiple shells and expect identical output.

That's the idea!

As you point out, it moves the trust from the binary to the shell executable, but the shell is already a key piece of any build process and requires a minimum level of trust. The technique of bootstrapping on multiple shells and comparing the outputs is known as Diverse Double-Compiling[0], and we think POSIX shell is particularly suited for this use case since it has so many implementations from different and likely independent sources.

The age and stability of the POSIX shell standard also play in our favor. Old shell binaries should be able to bootstrap Pnut, and those binaries may be less likely to be compromised, as the trusting trust attack was less well known at that time. This is akin to low-background steel[1], which was made before nuclear bombs contaminated the atmosphere and, with it, the steel produced after that time.

0: https://dwheeler.com/trusting-trust/

1: https://en.wikipedia.org/wiki/Low-background_steel
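For the curious, the "run it under several shells and compare" check described above might look roughly like this. It is a minimal sketch, not Pnut's actual tooling: the helper names (`digest`, `outputs_agree`, `run_under_shells`) are invented for illustration, and it assumes the script writes its result to stdout.

```python
import hashlib
import subprocess


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one shell's output."""
    return hashlib.sha256(data).hexdigest()


def outputs_agree(outputs: dict[str, bytes]) -> bool:
    """True when every shell produced byte-identical output."""
    return len({digest(out) for out in outputs.values()}) <= 1


def run_under_shells(script: str, shells: list[str], args: list[str]) -> dict[str, bytes]:
    """Run the same script under each shell and collect its stdout."""
    results = {}
    for shell in shells:
        # check=True raises if a shell exits with a non-zero status.
        proc = subprocess.run([shell, script, *args], capture_output=True, check=True)
        results[shell] = proc.stdout
    return results
```

Something like `outputs_agree(run_under_shells("pnut.sh", ["dash", "bash", "ksh"], ["pnut.c"]))` would then flag any disagreement. The shell list and the `pnut.c` argument are hypothetical here. A mismatch does not say which shell is wrong, only that at least one of them deserves a closer look.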

> Hrmmm. But why?

because Bash goes brrrr