I've seen this same argument time and time again and it's just silly. We preach that curl | sh is evil because of a potential lack of "transparency" but rarely does anyone denounce the evils of binary packages.
When you run third-party code on your system there is ALWAYS a risk of it doing nasty things; it doesn't matter if it's an easily readable bash script or a .deb you downloaded. The biggest argument against curl | sh that I can agree with is the problem that arises when your connection dies in the middle of the download. Just download the file, then run it.
Yeah, don't download and run binaries from random links on the internet either.
I don't think anyone is against recommendations of running "curl trustedsite.com/install | sh" except for the bad habits it teaches to people who don't know what curl and sh actually do, but wouldn't download and run a random exe.
Help forums are rife with suggestions to run "curl http://pastebin.com/raw.php?i=XXXX | sh" to solve technical problems. And not fringe forums either, but forums like the official Ubuntu forums.
I've literally never heard anyone "preach that curl | sh is evil" who wouldn't/isn't saying the exact same thing about binaries.
I wrote this article a year ago, and you hit the nail on the head - that is what I was getting at.
I didn't submit it here because it wasn't really meant for an advanced crowd; obviously most people here would be aware of the dangers.
For example, take a look around at the pirating world. Many sites that help you install projects such as Couchpotato, Sickbeard, Sabnzbd etc rely on people curl-piping bash scripts. The people installing from those scripts likely do not know any better.
It's a little worse because with curl | sh you inherently aren't able to check an MD5 hash or a signature to verify the file is actually what you wanted.
Now, while even with binaries people might not actually do that often enough, it is at least still an option.
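To make the "still an option" part concrete, here is a minimal sketch of a checksum check on a file that has already been downloaded. The function name verify_sha256 is hypothetical, the project is assumed to publish the expected hash somewhere you trust, and the sketch uses sha256sum from GNU coreutils (md5sum is used the same way if the project only publishes an MD5).

```shell
# verify_sha256 FILE EXPECTED_HEX
# Succeeds only if the SHA-256 of FILE equals EXPECTED_HEX.
# (Hypothetical helper name; assumes GNU coreutils sha256sum is installed.)
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Usage, with a placeholder file name and a hash copied from the project's site:
#   verify_sha256 install.sh "<published hash>" && sh install.sh
```

None of this is possible when the download is piped straight into sh, because the shell starts executing before there is a complete file to hash.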
Truth is I'd rather read someone else's shell script than someone else's C, python, ruby, javascript or other code. Not to say it still isn't painful reading; most scripts I see are nonsensically verbose. But it is a much less time-consuming read.
Unless, of course, it is written in a shell like Bash, i.e., one with too many extra features to keep track of. Like, say, exporting of functions, for example.
We preach that curl | sh is evil because of a potential lack of
transparency but rarely does anyone denounce the evils of binary
packages.
This is "Freedom 1"[1] and a bedrock principle of the FSF.[2] Appelbaum recently gave a talk "Free Software for Freedom Surveillance and You" about the evils of binary packages.[3]
There are other risks besides malicious webservers. Even an accidental network glitch can be fatal, for example if the connection is dropped after the first "/" here:
curl won't buffer the entire stream, since that would be silly, so if it is a big enough response then curl will have already passed parts of it along to the shell through the pipe.
Is this likely to cause a catastrophic failure? No. Is it possible? Absolutely.
When curl gets to the end of the input, it exits, its end of the pipe is closed, and the shell sees an EOF. You get a broken pipe when writing to a pipe whose reader has gone away, not when reading from one.
I verified this on OS X with the below "server":
$ stty -icanon min 1 time 0
$ nc -v -l 6666
GET / HTTP/1.1
User-Agent: curl/7.34.0
Host: 127.0.0.1:6666
Accept: */*
HTTP/1.0 200 ok
Content-Length: 1000
echo foo
echo bar^C
(^C is a control-C). Even with an explicit content-length so curl knows the response was truncated, and without a terminating newline, the shell executes both commands.
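The same behaviour can be reproduced without a network at all. This is a small sketch that stands in for the truncated response with printf; the two echo commands are the ones from the session above, and the variable name truncated_output is just for illustration.

```shell
# Feed sh a stream that ends mid-line (no trailing newline), as a dropped
# connection would. The shell still runs the final, unterminated command.
truncated_output=$(printf 'echo foo\necho bar' | sh)
printf '%s\n' "$truncated_output"
# prints:
# foo
# bar
```

Swap the last echo for a command that was cut short and the shell will run whatever prefix it received.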
Again, quite unlikely to be a problem in real life. But it is still a bad habit to feed curl into sh directly.
I don't understand why you would go through all of this effort...
Just dump the data into a file:
curl > foobar
Read the file using any number of normal utilities
vim foobar
cat foobar
nano foobar
less foobar
Then if you like what you see execute the file
sh foobar
Linux/Unix utilities are meant to be used. Don't limit yourself to only knowing how to check the contents of a curl install if you have a curlsh function.
As programmers, our entire job revolves around removing unnecessary processes that can be automated. So I ask you, how does providing a shell function which does exactly what you just suggested limit somebody?
If they can't read it, they likely wouldn't even know how to install it.
Disclaimer: I wrote this article in Aug of last year.
It takes up precious mind space for an ad-hoc one-off feature, instead of utilizing simple well-established unix commands commonly available on servers and usable for multiple purposes. Plus I'd say it gives a false sense of security because a moderately determined attacker can easily obfuscate his exploit so as to slip through this casual review process.
The mind space argument breaks down because it's generally an install process, which means you are just typing what they wrote in their README.md file if it seems reasonable. It's not something you memorize.
I think the "mind space" is knowing how to handle data, run an editor, and chmod or run a shell with inputs (all useful, portable skills), versus having a limited-use hand-holding script.
I'd say apply the same level of scrutiny as you would other code, such as the code that your distribution allows you to install. That means:
1) Find a source you trust (nominally)
2) Get a gpg-key that you trust belongs to that source
3) Get the install.sh script
4) Get the matching gpg signature (install.sh.asc)
5) Verify that 4) is a valid signature of 3) under 2)
6) Have a look at the script
7) Run the script
If you can't establish 2), you'll just have to stick to 3) 6) and 7).
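Steps 3) to 5) can be sketched roughly as below. The function name verify_signature and the file names are hypothetical; the only real command is gpg --verify, which takes the detached signature first and the signed file second. It assumes the key from step 2) has already been imported into your keyring.

```shell
# verify_signature SCRIPT SIGNATURE
# Succeeds only if SIGNATURE is a valid detached signature of SCRIPT made by
# a key in your keyring. gpg's own output (which key signed) is left visible,
# since deciding whether you trust that key is still step 2).
# (Hypothetical helper name.)
verify_signature() {
    gpg --verify "$2" "$1"
}

# Usage with placeholder file names, after fetching both files:
#   verify_signature install.sh install.sh.asc && less install.sh
#   sh install.sh    # only once you are happy with what you read
```

A good signature only says the file is what the key holder signed; steps 6) and 7) are still up to you.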
Seeing that something is on an https site just means someone had the access to put it there. If someone got access to the private key behind 2) -- 1) is probably so compromised that there isn't anything other than 6) that might protect you -- and if the script is truly malicious (as opposed to just your average botched bash script) -- it's not guaranteed that it's obviously malicious.
Anyway, a gpg signature links some distributable the author has verified all the way back to wherever that file was authored -- while https only anchors trust on the web server. Web servers get compromised all the time. Prefer a proper signature as a means to anchor trust ("yes, this is probably what X wanted to distribute. If you trust X, this is probably OK").
An https signature just means: "This is something someone/anyone managed to upload to this web server".
If you don't want to bother with all this you can also just do a simple `wget http://site.com/file.sh -O /tmp/script`, look through it in your editor, then run it.
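A minimal sketch of that download-first flow, written with curl instead of wget. The function name fetch_script and the URL in the usage comment are hypothetical. The point is that the script only reaches sh if the download finished cleanly, since curl exits non-zero on an HTTP error or a transfer that ends short of the advertised Content-Length.

```shell
# fetch_script URL
# Downloads URL to a temporary file and prints the file's path, but only if
# curl reports success. (Hypothetical helper name.)
fetch_script() {
    dest=$(mktemp) || return 1
    # -f: fail on HTTP errors, -sS: quiet but still show errors,
    # -L: follow redirects, -o: write to a file instead of stdout
    if curl -fsSL -o "$dest" "$1"; then
        printf '%s\n' "$dest"
    else
        rm -f "$dest"
        return 1
    fi
}

# Usage with a placeholder URL:
#   script=$(fetch_script https://example.com/install.sh) && less "$script"
#   sh "$script"    # after reading it
```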
Is this really a "hidden" danger? It's pretty obvious that you shouldn't execute a script without reading it unless it's from a trusted source over https.
I don't see much reason for ever needing to do this. You should build packages to install software, and use config management for anything needed outside the package.