Hacker News
by rnijveld 2021 days ago
I wonder why there is such a focus on this `curl|bash` pattern. Meanwhile most of us are downloading hundreds of thousands of lines of code using all kinds of package managers and I don't see many inspecting all those downloaded files, especially not manually. If you really wanted to verify everything, I don't think you would ever get around to doing anything other than checking.

I'm not saying that downloading something from your official OS package repositories is the same as downloading a random URL from the internet. The thing I'm more thinking about is language specific package managers such as NPM, Composer and Cargo. Or user repositories, things like AUR, PPAs and non-official apt repositories, where any random person can put something up. The thing for those is that they almost look like they are something official and something to be trusted. Often times they are displayed on an official site, you download them from a trusted URL and they look like they are really secure, even with hashing and things like that built-in. Lots of package managers don't support any way of verifying the identity of the one uploading the files, and even if they do we often import signing keys into our chain of trust without a moment of thought or we don't use the signing mechanism at all.

And with something like NPM packages you are likely to download a few dozen other packages which you didn't even intend to download. You will probably run a lot of code there that could be doing all kinds of horrible things.

At least with `curl|bash` I get some feedback about where the code is originating from: what URL I will be downloading something from, and whether that is some place I can trust. At least I get some degree of identity verification (albeit very, very weak), as long as I trust the owner of the site to protect it adequately against unauthorized uploads.

13 comments

I would put the language-specific package managers in the same category as curl|bash because anyone can push code without anyone else checking it, but there is a real difference with your distribution: it acts as an independent third-party. In that sense they act somewhat similarly to Certification Authorities, in that I as a user will not blindly trust a self-signed certificate but will trust a certificate that was vetted by this third-party.

In practice when you install something from AUR with a helper, it's not that far from doing a curl|bash (except the helpers will nag you to inspect the content, but allow you to skip doing it by default). The difference is who you curl it from.

Edit: to be precise, I do differentiate official repos and "third-party" repos; the latter are definitely a more integrated curl|bash, and the same precautions apply

It's more complicated than that.

curl http | bash - you're basically throwing caution to the wind, as anyone between you and the server can rewrite the traffic, meaning a fairly large number of people could serve you something malicious.

curl https | bash - you're putting your trust in the server, and the PKI / CA infrastructure. A small number of people can hurt you. If we park the PKI argument, only the owner of the server can attack you. The problem here is that the server owner can specifically detect your behaviour and take advantage of your trust.

language specific - _generally_ you're pulling a hash from a repository that's public and many people can and do audit. You can't be spot served a slightly different, malicious version of a file, without it first being published for others to see. This means the vendor risks their reputation with this kind of attack, and you're likely to find out about it at some point down the line.

Obviously reality is slightly more complicated, but if your language package manager is relatively modern, pulls and checks via hash, and offers up a .lock file functionality, then it's quite a bit different from a curl http(s) | bash.
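
A minimal sketch of the hash-pinning idea described above, assuming a lock file that maps each artifact name to a SHA-256 digest. The lock format and the names (`LOCK`, `verify_artifact`, `left-lib-1.0.0.tgz`) are made up for illustration and do not match any particular package manager.

```python
import hashlib
import hmac

# Hypothetical lock-file contents: artifact name -> pinned SHA-256 hex digest.
# Real lock files (package-lock.json, Cargo.lock, composer.lock) differ in format.
LOCK = {
    "left-lib-1.0.0.tgz": hashlib.sha256(b"published contents").hexdigest(),
}


def verify_artifact(name: str, data: bytes, lock: dict[str, str]) -> bool:
    """Return True only if `data` hashes to the digest pinned for `name`."""
    pinned = lock.get(name)
    if pinned is None:
        return False  # never install something the lock file does not mention
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(actual, pinned)


print(verify_artifact("left-lib-1.0.0.tgz", b"published contents", LOCK))
print(verify_artifact("left-lib-1.0.0.tgz", b"published contents + backdoor", LOCK))
```

A matching hash only shows that you received the same bytes as everyone else; it says nothing about whether those bytes are benign.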

For starters I think we can agree that no-TLS is just out of the question in any case.

You are right that there is a difference, but to me the real threat model is different: in your comparison you assume that the original author is legit but the vendor can be malicious. I believe it's more accurate to assume the original author is malicious. In that case:

curl https | bash - you are compromised

language specific - the malicious author's content isn't checked before it is pushed. They have a window of opportunity before being discovered by the community and banned, but the hashes don't protect you: the verification must be done manually

third-party repos - I'll only take the example of AUR because it's the one I know best: if the malicious author is also the packager, the situation is the same as the previous one. But, as is often the case, if the malicious author is not the packager, the latter has to be convinced to serve bad content and acts as a simple gateway

Agreed yeah.

There's a whole bunch of complexity that goes into whether or not you should trust an entity.

In general though, in my opinion, if you use reasonably popular and thus regularly audited packages, and you have protective monitoring and a defense-in-depth framework, there is obviously still a risk of you being first to pick up a bad commit, but you can mitigate that relatively well.

Front end has different considerations. I believe you can defend against the magecarts of the world with CSP but it's not my own forte.

The big thing is, of course, if you're not willing to do your part scanning, reviewing and auditing, nobody else will. Tragedy of the commons and all that.

>it acts as an independent third-party. In that sense they act somewhat similarly to Certification Authorities, in that I as a user will not blindly trust a self-signed certificate but will trust a certificate that was vetted by this third-party.

I have no idea why anybody trusts CA's in the first place. People seem to imagine that there's some gate in play where Mr. D. Badguy doesn't get certs signed by Verisign. He absolutely does.

This has been an issue that "Web of Trust" doesn't really do anything to solve, and delegating the worry about this crypto nonsense to admins instead of users themselves just kicks the can down the road. Random code on the net is exactly like buying a black box in a bazaar somewhere. If you don't have the skills to run/vet/sandbox it safely, no amount of Web of Trust nonsense will save you from it.

All it does is piss off users, devs, and admins alike when something goes wrong with certs, and gives a centralized authority a lever to pull to screw with you. Another brick in the monopolistic wall.

> All it does is piss off users, devs, and admins alike when something goes wrong with certs, and gives a centralized authority a lever to pull to screw with you. Another brick in the monopolistic wall.

Oh, c'mon. Bad certs do get issued, but it's rare. And blindly trusting an attestation from DigiCert that you're talking to Amazon.com is a whole lot better than most ways you'd check.

And then pinning, in turn, makes things a lot more resistant to many of the attack scenarios that remain, for users who visit you multiple times.

It's because security is not an on/off switch, it's a sliding scale. The further you push it, the less convenient it is. No one ever said Verisign as a CA is a perfect system; it's just better than assuming the server's certificate is legit. It reduces the risk, it doesn't remove it.

At some point you want to use the Service/see the content. As you said, you can't vet the whole stack from top to bottom, there is not enough time in a life for that. You have to start trusting someone, somewhere

Exactly this. There is precisely one entity[0] who can legitimately certify that a particular public key belongs to whoever `example.com` belongs to, and that is whichever entity controls the (definite article, globally unique) DNS servers for `com`, exclusively in a capacity not detectably distinct from the rest of the process of registering `example.com` as a resolvable domain name.

0: Mumble mumble namecoin, mumble mumble not technically an entity, but that's not particularly relevant for most cases.

I personally don't like `curl | bash` because I don't know _how_ something will install: 1) What are all of the directories that something will insert itself into? 2) What of my files (.bashrc, etc.) will it modify? 3) If it modifies those things, will it tell me?

The `curl | bash` install pattern means that it can do _anything_.

Using a package manager I know that the install will be "typical", and easy to uninstall (that's the case with most of the package managers that I use anyways). Each package manager has a different pattern, sure, but at least it will be predictable.

This isn't how I'd describe the guarantees provided by a package manager. In fact, most package managers don't really provide any guarantees at all; almost all of them support something like preinst.sh and postinst.sh scripts which can basically do anything. It's the package maintainers that are supposed to provide the guarantees you describe. Of course, they're only human, and their incentives might not line up with yours.

And if you stray outside the official channels, as most users must at least some of the time, then you're back to all-bets-are-off. Fetching and installing packages from a channel hosted by some third party really is no better from a security standpoint than running a (signed) shell script from that same party.

EDIT: I should add that there may be some new, advanced package management systems that do actually provide strong guarantees, like only putting files in certain directories, never setting the setuid/setgid bits on executable files, or perhaps ensuring that all files from a package are owned by a user:group associated with that package (the Linux From Scratch docs describe a package management scheme like that, it's worth checking out). I'm referring here to the majority of popular package managers, e.g. dpkg, which will run arbitrary code during installation.

You make some good points, but I want to follow up:

With dpkg packages for example, you do get a few guarantees.

1. The package will include a list of files which it installs

2. The package manager will not overwrite existing files which were installed by dpkg without an explicit diversion

3. When uninstalling, the package manager will remove any of those files, and the directories created for them (unless they are not empty or are also created by another package)

4. It won't run as non-root (unless you've made some major changes to your system), and as such won't prompt you for, or try to take advantage of, sudo access.
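
The file-list bookkeeping in points 1 to 3 can be modelled roughly as below. This is a toy model written for illustration, not dpkg's actual implementation; the `PackageDB` class and its methods are hypothetical, and it ignores maintainer scripts, diversions, and config files entirely.

```python
class PackageDB:
    """Toy model: tracks which package owns which path (no real filesystem access)."""

    def __init__(self) -> None:
        self.owner: dict[str, str] = {}  # path -> owning package

    def install(self, package: str, files: list[str]) -> None:
        # Refuse to overwrite a file owned by a different package.
        conflicts = [f for f in files if self.owner.get(f, package) != package]
        if conflicts:
            raise ValueError(f"{package} would overwrite {conflicts}")
        for f in files:
            self.owner[f] = package

    def list_files(self, package: str) -> list[str]:
        return sorted(f for f, p in self.owner.items() if p == package)

    def remove(self, package: str) -> list[str]:
        # Only paths recorded for this package are removed.
        removed = self.list_files(package)
        for f in removed:
            del self.owner[f]
        return removed


db = PackageDB()
db.install("foo", ["/usr/bin/foo", "/usr/share/doc/foo/README"])
try:
    db.install("bar", ["/usr/bin/foo"])
except ValueError as err:
    print("refused:", err)
print(db.remove("foo"))
```

Files written by an install script outside this bookkeeping would be invisible to such a model.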

Sure, that doesn't stop out-of-the-norm behaviour; the Oracle Java packages are a great example of this. Those packages contain only a shell script which downloads, unpacks, copies, and symlinks the actual Oracle Java tarball from Oracle's website, and then (ideally) removes those files if you're uninstalling. Still, it's far more in the way of guarantees than curl|bash provides.

I don't think the guarantees you've numbered 1, 2, or 3 are true. Insofar as the package uses the standard mechanism for installing files, sure, it can guarantee that. But I don't believe it hooks a tracer up to the installer script to detect the betrayal of those guarantees. I think it just runs the install script, as root, trusting that the files list and uninstall scripts will do their job. The whole thing is based on implicit trust of the package maintainer, not guarantees in software.
You're right, and I've called that out in my post as well (re: Oracle Java, as an example).

That said, I've got far more trust in someone who's gone to the trouble of making a .deb file than someone who put a shell script on GitHub.

This is not exactly a fair comparison because it is documented and configurable but I've recently found out apt on its default settings does something unexpected (for me at least) when removing packages (purge + autoremove): Normally you (I) would expect all automatically installed dependencies (depends/recommends/suggests) to be gone after this, if no other package references them in its depends/recommends lists (which is what gets installed on the default settings).

However, it turns out that if a package suggests another package and that other package somehow gets installed, the suggested package will not be autoremoved anymore, because autoremove honors suggests relationships as a reason for not removing automatically installed packages. While there are valid reasons for this (e.g. when installing something with --install-suggests), it also amounts to a lot of unwanted packages after a while of installing/uninstalling software. I don't know if this has a widespread name, but I call it "suggestion congestion".

Of course, one can turn this off by setting APT::AutoRemove::SuggestsImportant to "false". And really, that is an awful problem to solve since you have to deal with different users and package maintainers with different expectations. And apt still solves a lot more problems than it creates.
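
A rough sketch of the autoremove behaviour described above, assuming a simplified model in which an automatically installed package is kept if it is reachable from a manually installed one. The package names, the `Package` type, and `autoremovable` are illustrative; this is not apt's actual resolver.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    auto: bool  # True if installed automatically as a dependency
    depends: list[str] = field(default_factory=list)
    recommends: list[str] = field(default_factory=list)
    suggests: list[str] = field(default_factory=list)


def autoremovable(installed: dict[str, Package], suggests_important: bool) -> set[str]:
    """Return the auto-installed packages that nothing kept still references."""
    keep = {name for name, pkg in installed.items() if not pkg.auto}
    stack = list(keep)
    while stack:
        pkg = installed[stack.pop()]
        edges = pkg.depends + pkg.recommends
        if suggests_important:
            edges = edges + pkg.suggests  # a Suggests link also protects the target
        for dep in edges:
            if dep in installed and dep not in keep:
                keep.add(dep)
                stack.append(dep)
    return set(installed) - keep


installed = {
    "editor": Package("editor", auto=False, depends=["libedit"], suggests=["spell-dict"]),
    "libedit": Package("libedit", auto=True),
    "spell-dict": Package("spell-dict", auto=True),  # pulled in earlier by something else
}
print(autoremovable(installed, suggests_important=True))
print(autoremovable(installed, suggests_important=False))
```

In this model the suggested package only becomes removable once the suggests edges are ignored, which mirrors the effect of the setting mentioned above.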

But I'm now convinced that there is no such thing as a clean uninstall. At least not until the year of the stateless ZFS snapshot rollback NixOS desktop.

You can also pass --no-install-recommends to apt for a one-off installation to avoid pulling in a ton of garbage from a specific installation.
Meanwhile, in Windows-land, literally everything is installed by clicking Setup.exe.
Not quite. Corporate machines will most likely have some kind of management like SCCM, and there are options like PatchMyPC or Ninite for home users.

There's also Chocolatey and OneGet or whatever it's called today, plus vcpkg and nuget over in developer land.

Chocolatey and OneGet packages are usually wrappers around setup.exe/setup.msi with commandline arguments to keep the installer quiet, nothing more.
> I wonder why there is such a focus on this `curl|bash` pattern.

Because it's easy to understand, it's a cheap way to look smart on the Internet by bashing people. Also, on a lot of servers people might only run distro-packaged packages. More eyes have gone through them, so ops people would bash someone for curl | bash on their servers while it's perfectly "acceptable" on client machines.

I have put several curl|bash things into production and it's done because it's the only way to run installers that work on everything but windows without having to maintain a .deb, .rpm, and brew formula or something.

Often I'll write something like: here's the install steps, or just enter this curl|bash line into your terminal. Guess which users prefer.

People who care can download the script first and/or run it as a different user or in a vm. It's not that scary.

I like it, can you make them available?
This is the answer. It’s the “never use inline styles” of ops: A rule that was once taught for good reasons, and is easy for people who know little else to call out and enforce. Never mind that times have changed and the reasoning that caused people to create these rules in the first place no longer makes sense.
Agreed.

The truth is: We're downloading and executing code from the internet all the time and the amount of trust we can put into this is very fragile. Some risks can be mitigated by installing stuff in containers if you don't need them to interact with the rest of your system. It's conceivable that the whole situation could be improved by a combination of reproducible build and packaging processes, transparency logs etc., but none of that exists today in any way that would provide a reasonable level of protection.

Right now the curl|bash-pattern isn't any more problematic than downloading an installer from a random page and doing chmod +x;./install.sh or using a package manager installing an unreasonable amount of dependencies.

> Meanwhile most of us are downloading hundreds of thousands of lines of code using all kinds of package managers.

Depending of course on the Package Manager, but traditionally those are signed, usually by people who actually do inspect the code. (I used to maintain Fedora RPMs, we audited code before putting our signature on it)

curl|bash allows personalized attacks... for example, if you have an IP address from a certain company. (If you have access to ad targeting data you can refine this a lot further: just remember website visits from an IP and match them to the IP from the curl command.)

repos are mirrored, come with signing keys and any successful attacks are detected sooner or later and become public knowledge.

I wasn't arguing that official distro repositories are unsafe, I was actually saying user provided repos are almost as bad (or even worse in some ways, given that they give the feeling of being way more secure) as `curl|bash`. Even if they are signed (such as AUR and PPA) most people will blindly add signing keys for people or organisations they do not know, giving them the feeling that they have secured themselves, but have they really?

I guess detecting attacks is easier if all files have to be uploaded to a central service, which does allow everyone to see the personalized attack (I mean adding `if (targetUser()) attackTarget()` isn't that hard, but it would be visible for everyone compared to doing that server-side). But then if I'm a sophisticated attacker I'd be sure to make that way less obvious in my code. My feeling is that it would be detected later rather than sooner if hidden well enough. And that is excluding things like non-official apt repositories.

Is `curl raw.githubusercontent.com/.. | bash` fine for you? I think most curl | bash uses point at a GitHub master branch. Using your own domain is actually scary, as the owner has to be sure that they never lose control of it.
I would never pipe something to bash.

Always download, inspect, run. (maybe even backup if something strange happens)
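
The download-then-inspect step could look roughly like this sketch, which saves the script to a file instead of piping it into a shell and refuses plain HTTP. The function names and the example URL are placeholders; nothing here executes the downloaded script.

```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def require_https(url: str) -> str:
    """Reject anything that is not an https:// URL with a host."""
    parts = urlparse(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"refusing non-HTTPS URL: {url!r}")
    return url


def save_for_inspection(url: str, dest: Path) -> Path:
    """Download `url` to `dest` so it can be read before anything is run."""
    with urllib.request.urlopen(require_https(url), timeout=30) as resp:
        dest.write_bytes(resp.read())
    return dest


# Example (not executed here; the URL is a placeholder):
#   path = save_for_inspection("https://example.com/install.sh", Path("install.sh"))
#   Then read the file and, if satisfied, run it yourself: bash install.sh
```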

Really though? What are you looking for in this inspection?

This strikes me as one of those things where the “inspectors” underestimate the security of “curl|bash from a known HTTPS origin” and overestimate their ability to detect anything that could evade that security. At that point you’re dealing with a g0d level hacker, or your cert trust has been broken, and in either of those cases you were already pwned.

I read the script and see if I like what I see.

As an example: https://sh.rustup.rs It's really easy to read and useful to understand what it does.

If it's too obfuscated and I can't understand it I don't run it and look for other install options or give up

If I do spot bugs, I'll go to their github and provide a PR.

If I spot something malicious I'll check the github to see who put it in and raise the problem. (if it's not on github then alarm bells)

> repos are mirrored, come with signing keys and any successful attacks are detected sooner or later and become public knowledge.

1. Not all package managers come with signing keys or actually check them.

2. "Sooner or later" - weasel words. Some of these breaches have been discovered years after the fact. Who really cares if they get discovered after 3 years? By that point all the harm has been done plus the attacker could have taken control of the systems in more varied ways so even removing the initial entry point won't save you.

> 1. Not all package managers come with signing keys or actually check them.

Seems like a very big problem with those package managers... Ubuntu as far as I'm aware does proper signing. (as does any sane distro and hell, Microsoft too)

I would not be using those package managers.

> 2. "Sooner or later" - weasel words.

What's your point? I trust Ubuntu/Red Hat to keep their keys safe. I trust that Google Project Zero and others would notice anything spooky.

I do not trust a random distro with only a few users to keep their keys safe and I do not use that.

It's also hard to do a proper attack when you have:

ubuntu -> (n) mirrors -> me

Ubuntu can't push a malicious package directed at me (I go via mirrors which can be picked at random)

Mirrors can't push a malicious package directed at me (they would need ubuntu signing keys, and someone would need to own all of them or be very lucky)

And if someone does compromise Ubuntu's keys, they're not going to go after me and risk getting detected that way.

There is a lot more security built into package managers than what I described, compared to the zero you get with curl|bash.

That's true if you're using Ubuntu's repos. But a lot of software on Linux comes as a key that you need to tell apt to trust, and then a repo that uses that key. This is just as unsafe, if not more unsafe, than curl | bash - it gives me a way to not just send you malicious code today, but also any other time you apt upgrade.
We are talking on an article that highlights a major flaw of curl|bash.

The website owner can determine if you are just downloading the script to investigate it or if you are downloading and running it.

In the latter scenario the owner can decide to give you bad code, and you won't know what happened and can't prove that the website owner did anything to you.

With APT the owner cannot see which case it is in, someone can always investigate what is being published by just downloading a package.

Otherwise, as you noted - if you trust the wrong person you will get owned either way, but curl|bash is inherently more dangerous due to easy targeting.

(I can push a package in apt via curl|bash too so it gets upgraded regularly)

While this technique allows an attacker to avoid revealing the exploit if you simply redirect the curl output to a file, it will contain tell-tale information (in this case, bufferloads of zero bytes) allowing one to discern that it is up to no good.
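
As a rough illustration of spotting that padding in a saved download, the sketch below reports long runs of NUL bytes. The threshold and the function name are arbitrary choices for this example, and a careful attacker could pad with something less obvious.

```python
import re


def find_nul_runs(data: bytes, min_len: int = 1024) -> list[tuple[int, int]]:
    """Return (offset, length) for each run of at least `min_len` NUL bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


# Invented sample data: one ordinary script, one with a large block of padding.
clean = b"#!/bin/sh\necho installing\n"
padded = b"#!/bin/sh\nsleep 3\n" + b"\x00" * 4096 + b"echo payload\n"

print(find_nul_runs(clean))
print(find_nul_runs(padded))
```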

The author hints at other techniques for detecting curl|bash (http or dns callbacks), which would obfuscate but not completely mask the attacker's intentions.

Note that I'm not advocating for using curl|bash: it's a technique for gathering low-hanging fruit, and there's no point in putting yourself in that position.

Quick note - I've had this happen to me.

- browser crash

- I reload last website

- crash again

- I know that site has an exploit - so I try curl to get the payload - it's no longer there.

- I set up wireshark - open up in browser - exploit no longer there.

I'm now stuck with no way to figure out what happened, core dump is useful to prevent the crash but not find the code that triggered it.

So disconnect / fresh install OS.

This kind of targeting can happen now with curl|bash detecting if you install or just download.

It would require somewhat more sophistication on the attacker's part to detect curl|tee|bash being run in a VM, I think. Also, can you start bash with tracing on? Or put awk in the pipeline to turn it on, and also filter out attempts to turn it off?
Package managers include npm, bundler, maven, gradle, cargo, etc, not just distro ones.
and those package managers need to have security built into them as well.
Anyone that dedicated would probably more likely bash you over the head with a rock until you give up your password
why would they do that when they can run a script from halfway around the world and take profit without getting caught?

I'm thinking ransomware attacks, bitcoin mining farms set up on AWS accounts after stealing keys / racking up huge costs, bank account takeovers, stock market account takeovers...

Someone hitting you in the head is easier to avoid / easier to recover funds from. (and if you're in the US and have a gun, that person trying to hit you in the head is going to have a bad time)

I mean they'd more likely do that than target you personally with a curl|bash. That's a very noisy and blatant move, super unlikely to work on anyone techy enough to know what curl and bash are, probably the last resort. Other exploits are on the table and indeed much more likely too
Because piping curl into bash is just an unnecessary risk that gives you very little benefit (you speed up setup a bit), while package managers actually help keep a project updatable and deployable in the long term. In the end we all end up with some sort of compromise between security and usability/maintainability; 100% secure doesn't exist. Trimming as many risks as you can do without, while keeping most of the useful functionality, is a reasonable strategy for most projects.
Maybe it is because the website owner has full authority to change anything at their discretion, while git packages usually exist in an ecosystem that can be observed and tracked.
git allows rewriting history. It doesn't seem unlikely one could come up with an attack which gives a malicious git clone to one user, and then rewrites history so all other users later don't see the maliciousness.
Rewriting history has absolutely nothing to do with this. In a VCS that doesn't allow this, I could just hand out repo1 and repo1+malicious-patch. In both cases (as with git as well), I can detect this by comparing hashes.
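
To make the hash comparison concrete, here is a simplified sketch in which each commit ID covers its parent's ID and its own content, loosely in the spirit of git's object hashing but not its real format. `commit_id`, `head_of`, and the sample histories are invented for the example.

```python
import hashlib


def commit_id(parent: str, content: bytes) -> str:
    """Simplified commit ID: hash of the parent ID plus this commit's content."""
    return hashlib.sha256(parent.encode() + b"\n" + content).hexdigest()


def head_of(history: list[bytes]) -> str:
    """Fold a list of commit contents into the ID of the final commit."""
    head = ""
    for content in history:
        head = commit_id(head, content)
    return head


public = [b"initial import", b"add installer", b"fix typo"]
served_to_victim = [b"initial import", b"add installer + backdoor", b"fix typo"]

# Any change anywhere in the history changes the final ID, so two users
# comparing head IDs would notice they were given different repositories.
print(head_of(public) == head_of(served_to_victim))
```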
> And with something like NPM packages you are likely to download a few dozen other packages which you didn't even intend to download. You will probably run a lot of code there that could be doing all kinds of horrible things.

This got me thinking - how easy would it be to orchestrate a dependency-based attack that would cripple a large number of applications - for example with the help of a maintainer of a popular open-source project gone rogue? Do large tech companies frequently audit the 3rd party code that goes into their applications or is it largely based on trusting the open-source maintainer?

Are you familiar with the left-pad incident? One maintainer dropped a bunch of predominantly trivial repos, which had a large impact on npm.
Note that for the left-pad incident, the impact was build failures, not remote code execution.
There was a time when I made a point of only installing from source code, and never even using package managers. Although, of course, it wasn't possible to read through all of the source code and make sure that it wasn't doing anything malicious, this felt safer to me. I eventually had to give up, though, because troubleshooting a failed install from source is damned near impossible, and all the documentation you can find on anything assumes that you're using package managers to install everything.
Not that I personally care that much, but the idea is that using curl|bash you can get incomplete script because of a network error, and "incomplete" can end on any command, like instead of "rm -rf /home/user/.config/program/useless_dir" it could end on "rm -rf /home".
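
A small simulation of that failure mode, using the paths from the comment above. It also shows the common mitigation of wrapping the whole script in a function that is only called on the last line, so that a truncated download leaves an unfinished function definition instead of a runnable command. Nothing is executed here; the scripts are just strings, and `truncate_at` is a helper made up for the demo.

```python
NAIVE = "echo cleaning up\nrm -rf /home/user/.config/program/useless_dir\n"

# Mitigation: define everything inside a function, call it on the final line.
WRAPPED = (
    "main() {\n"
    "  echo cleaning up\n"
    "  rm -rf /home/user/.config/program/useless_dir\n"
    "}\n"
    "main\n"
)


def truncate_at(script: str, marker: str) -> str:
    """Simulate a connection dropping right after `marker` was received."""
    return script[: script.index(marker) + len(marker)]


cut_naive = truncate_at(NAIVE, "rm -rf /home")
cut_wrapped = truncate_at(WRAPPED, "rm -rf /home")

print(repr(cut_naive.splitlines()[-1]))  # the last command the shell would see
print("main\n" in cut_wrapped)           # whether the call to main ever arrived
```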
All Composer packages are namespaced, and Composer warns if the command is being run as root.

It has its own potential security issues with post-download scripts, but knowing the namespace helps a bit.

curl | sudo bash (the typical use case) means that whatever ass-backwards method of installation the developer thinks makes sense just happens without you being able to put any reins on it.

For example, Homebrew by default installs everything into /usr/local, but as your user. This is great for single-user systems, but everything goes all the way to hell when someone goes and installs it on a multi-user system and suddenly whatever versions of anything they've chosen to install become the default version for everyone on that system.

For Linux, if you have sudo permissions, it recommends you install it into /home/homebrew/.linuxbrew, which is completely nonsensical; it doesn't create a 'homebrew' user, and it shouldn't store local data in /home/<wherever>/ anyway (use /usr/lib/<wherever> or /var/lib/<wherever>).

Basically, the people who created HomeBrew don't seem to really understand the benefits of not making a complete mess of an existing system.

Compare that with, for example, MacPorts. They have an installer package that you can use on MacOS, or you can just clone the code and do './configure' and pass in whatever options you like. The first is great for the less technical, and the second is great for more technical. They install by default into /opt/local, which I've never seen anything else use, and they help you add the relevant paths to your path so that you can use it, but no one else does by default.

I've also seen other "install shell scripts" which do even worse things. One (I think from the Apache project?) would download a .deb package, if you were on Ubuntu, and then just manually unpack it over top of your existing filesystem. It wouldn't `dpkg -i foo.deb`, it would `dpkg-deb -x foo.deb /`, potentially overwriting anything that shared the same path, and making it impossible to uninstall. It's already a debian package! Just install it normally!

In other words, aside from encouraging the bad habit of "run code from the internet blindly as root", it's extremely, extremely rare that I come across a project which instructs me to do this but doesn't do something incredibly stupid in their script.

> At least with `curl|bash` I get some feedback about where the code is originating from: what URL I will be downloading something from, and whether that is some place I can trust. At least I get some degree of identity verification (albeit very, very weak), as long as I trust the owner of the site to protect it adequately against unauthorized uploads.

This isn't even remotely true. That shell script that you downloaded from https://llamasi.te/install might download and install arbitrary binary packages, binaries, config files, etc. from anywhere on the internet. It might install an older version of npm with security holes, overwrite your local node installation, and then download a bunch of npm packages with pinned versions full of exploits.

Unless you stop and read through their shell script to see specifically what they do, you have literally no idea what is going to happen with your system, and if you're going to stop and read their shell script it's probably significantly faster to just provide you with a list of prerequisites and a few commands to run, rather than make you read through a shell script full of if/else/fi to check which versions of sed and awk you have and where they are, just so that it can use them to parse out version information from other tools that you wouldn't need to use.
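
One cheap aid when reading such a script is to list every URL it mentions, since those are the other places it may fetch from. The regex below is a rough heuristic and the sample script is invented; it will miss URLs that are built up from variables or obfuscated.

```python
import re

# Matches literal http(s) URLs up to whitespace, a quote, or a closing bracket.
URL_RE = re.compile(r"""https?://[^\s"'<>)]+""")


def referenced_urls(script: str) -> list[str]:
    """Return the distinct URLs that appear literally in `script`, in order."""
    seen: dict[str, None] = {}
    for url in URL_RE.findall(script):
        seen.setdefault(url, None)
    return list(seen)


# Invented installer contents, for illustration only.
sample = '''#!/bin/sh
curl -fsSL https://llamasi.te/pkg.tar.gz -o /tmp/pkg.tar.gz
wget "http://mirror.example.net/old-npm.tgz"
curl -fsSL https://llamasi.te/pkg.tar.gz -o /tmp/again.tar.gz
'''

for url in referenced_urls(sample):
    print(url)
```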

Basically, when you curl|bash, you're assuming that the other person is trustworthy and knows what they're doing, and while you can make the determination of #1 fairly quickly, it takes a lot more time and energy to determine #2.