by nfm 1680 days ago
Because this is buried in the post and people don't seem to be grokking it:

> Second, on November 2 we received a report to our security bug bounty program of a vulnerability that would allow an attacker to publish new versions of any npm package using an account without proper authorization.

They correctly authenticated the attacker and checked they were authorised to upload a new version of their own package, but a malicious payload allowed the attacker to then upload a new version of a completely unrelated package that they weren't authorised for. Ouch!
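A minimal sketch of that class of bug, assuming the authorization check reads the package name from the request while the publish step trusts the name inside the uploaded file. This is not npm's actual code; `OWNERS`, `PublishRequest` and both publish functions are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical ownership table: package name -> users allowed to publish it.
OWNERS = {
    "my-own-package": {"mallory"},
    "popular-lib": {"alice"},
}


@dataclass(frozen=True)
class PublishRequest:
    user: str
    requested_package: str  # name the authorization layer sees
    manifest_name: str      # "name" field inside the uploaded package file


def is_authorized(user: str, package: str) -> bool:
    return user in OWNERS.get(package, set())


def vulnerable_publish(req: PublishRequest) -> str:
    """Authorize one name, then publish under whatever the payload claims."""
    if not is_authorized(req.user, req.requested_package):
        raise PermissionError(f"{req.user} may not publish {req.requested_package}")
    return req.manifest_name  # bug: never compared with requested_package


def checked_publish(req: PublishRequest) -> str:
    """Same flow, but reject payloads whose name differs from the authorized one."""
    if not is_authorized(req.user, req.requested_package):
        raise PermissionError(f"{req.user} may not publish {req.requested_package}")
    if req.manifest_name != req.requested_package:
        raise ValueError("manifest name does not match the authorized package")
    return req.manifest_name
```

With a request like `PublishRequest("mallory", "my-own-package", "popular-lib")`, the first function happily returns the unrelated package name, while the second one refuses.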


> However, the service that performs underlying updates to the registry data determined which package to publish based on the contents of the uploaded package file

Yeah, this is what's going to keep me up tonight. Yikes.

I can't help but wonder if the root cause was HTTP request smuggling, or if changing package.json was enough.

How do we even mitigate against these types of supply-chain attacks, aside from disabling run-scripts, using lockfiles and carefully auditing the entire dependency tree on every module update?

I'm seriously considering moving to a workflow of installing dependencies in containers or VMs, auditing them there, and then perhaps committing known safe snapshots of node_modules into my repos (YUCK). Horrible developer experience, but at least it'll help me sleep at night.

> How do we even mitigate against these types of supply-chain attacks

Don’t import thousands of modules from third parties just to write a simple web app. If you have 10 stable dependencies it’s no problem to vendor them and vet changes. If you have 10k you’ve entirely given up on any pretence of security.

Recently Node 16 LTS cycle started. One month and a few days before the carry-over, a super controversial package titled `coredeps` [0] was officially declared a core module and has been bundled with all official distributions since.

The NodeJS team refuses to discuss NPM because it's a separate 3rd party. And yet.... this NodeJS Core module comes pre-installed as a global NPM package.

We're just getting started.

This module installs or even reinstalls any supported package manager when you execute a script with a name that would match any that they'd recognise. Opt-in for only a short period, and intending to expand beyond package manager installations.

Amidst all that's been going on, NPM (Nonstop Published Moments) is working on a feature that silently hijacks user commands and installs foreign software. The code found in those compromised packages operated in a similar manner and was labeled a critical severity vulnerability.

The following might actually make you cry.

Of these third party remote distributions it's downloading, the number of checksums, keys, or even build configurations that are being verified is 0.

The game that Microsoft is playing with their recent acquisitions here is quite clear, but there's too much collateral damage.

[0] https://github.com/nodejs/corepack#readme

Not that I agree with the methodology running `corepack enable` introduces, providing OS shims for the specific package manager commands to download them...

corepack (or package manager manager) was transferred to become a Node.js foundation project and voted into the release by the Node.js Technical Steering Committee. The one member I'm aware of who is affiliated with Github/NPM abstained from the vote. The specific utility of corepack is being championed by the package managers not distributed with node, so that (Microsoft's) `npm` is not the single default choice.

I'm interested to hear what parts of this you see as coming from Microsoft/NPM as I didn't get that vibe? In my view this was more likely reactionary to the Microsoft acquisitions (npm previously being a benign tumour, doctors are now suggesting it may grow :)

I think Corepack is a bad idea and have explicitly added feedback to say so. That said, I know you're misrepresenting the situation (whether intended or not) by suggesting this is a Microsoft initiative (it's not, Microsoft acquired NPM, if anything is even relevant to that acquisition this is meant to distance Node from that initiative).

Whether this is entirely by design I don't know, but Microsoft's positioning in the ecosystem is just brilliant. They're like a force of nature now.

NPM's security issues prime the ecosystem for privacy and security topic marketing (ongoing, check their blog), which is leveraged to increase demand for Github's new cloud-based services.

In the meantime they will just carry on moving parts of NPM to Github until there's so little of the former left, that it'll be hard to justify sticking with it rather than just moving to Github's registry like everyone else.

Eventually NPM gets snuffed-out and people will either be glad it's finally gone, or perhaps not even notice.

To reiterate on what sibling comments said, I'm the one who spawned the discussion and implementation of Corepack, and npm remained largely out of it; the push mostly came from pnpm and Yarn.

Additionally, unlike other approaches, Corepack ensures that package manager versions are pinned per project and you don't need to blindly install newest ones via `npm i -g npm` (which could potentially be hijacked via the type of vulnerability discussed here). It intends to make your projects more secure, not less.
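For reference, the per-project pin lives in the `packageManager` field of `package.json`. A minimal sketch of reading it, assuming the `name@version` form with an optional `+hash` suffix as described in the Corepack README. `PinnedManager` and `parse_package_manager` are hypothetical names for illustration and are not part of Corepack.

```python
import json
from typing import NamedTuple, Optional


class PinnedManager(NamedTuple):
    name: str
    version: str
    hash_spec: Optional[str]  # e.g. "sha224.<hex>", or None when no hash is given


def parse_package_manager(package_json_text: str) -> Optional[PinnedManager]:
    """Return the pin from a package.json document, or None if it has none."""
    spec = json.loads(package_json_text).get("packageManager")
    if spec is None:
        return None
    # Assumed layout: "<name>@<version>" optionally followed by "+<hash spec>".
    spec, _, hash_spec = spec.partition("+")
    name, sep, version = spec.rpartition("@")
    if not sep or not name or not version:
        raise ValueError(f"expected name@version, got {spec!r}")
    return PinnedManager(name, version, hash_spec or None)
```

A project without the field simply yields `None`, which a wrapper script could treat as "no pin, refuse to guess".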

If anything this makes it worse.

- No security checks are present in the package manager download and installation process so there are still no guarantees.

- Existing installations of package managers are automatically overwritten when the user calls their binary. What if this was a custom compilation or other customisations were made?

- This solution does a lot more behind the scenes than just run that yarn command that the user asked for but hadn't installed.

- Why not simply notify the user when their package manager isn't installed or only allow it with a forced flag? (As has been suggested uncountable times by numerous people anywhere this topic came up over the years.)
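To make the first bullet concrete, here is a sketch of the kind of check being asked for: comparing a downloaded file against an SRI-style integrity string (`sha512-<base64 digest>`, the shape npm lockfiles use in their `integrity` field). This is an illustration only, not how Corepack behaves; `verify_integrity` is a hypothetical helper, and the check is only as good as the source the expected value comes from.

```python
import base64
import hashlib
import hmac


def verify_integrity(data: bytes, integrity: str) -> bool:
    """Check data against an SRI-style string such as "sha512-<base64 digest>"."""
    algorithm, sep, expected_b64 = integrity.partition("-")
    if not sep or algorithm not in {"sha256", "sha384", "sha512"}:
        raise ValueError(f"unsupported integrity string: {integrity!r}")
    actual = hashlib.new(algorithm, data).digest()
    expected = base64.b64decode(expected_b64)
    # Constant-time comparison; returns False on any mismatch, including length.
    return hmac.compare_digest(actual, expected)
```

An installer could refuse to unpack or execute anything for which this returns `False`.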

Disrespecting user autonomy, capacity to self-regulate, and ownership over their machine and code is not the way.

Edit: Formatting

People don't directly import thousands of modules. It's actually a lot closer to your "10 stable dependencies". But those dependencies have dependencies that have dependencies. It's a little hard to point the finger at application developers here, IMO.

Some of the comments in this thread are wild. Huge dependency trees are a bad pattern, plain and simple.

The problem isn’t only ridiculous amounts of untrusted code, but thousands of new developers of the last 10 years who think this is the way to write reliable code. They have never acknowledged the risks of having everyone else write their code for them, and they overestimate how unique and interesting their apps are.

If you must participate in this madness, static analysis tools exist to scan your 10000 dependencies; taking security seriously is the issue.

> Huge dependency trees are a bad pattern, plain and simple.

And what's the alternative? Do you write your own libraries to store and check password hashes complete with hash and salt functions? Roll your own google oauth flow? Your own user session management library?

It's madness on either side; the difference is that `npm install` and pray allows you to actually get things done.

A large standard library is a big part of the solution. Your project may pull in a crypto library that includes password hashing, and an oauth library, and a session management library, but all of those libraries will have few or no dependencies outside of the standard library.

When vetting a dependency, consider whether it depends on packages you already depend on. If the new dependency tree is too large, try breaking down the desired functionality: several smaller, lower-level direct dependencies may have a lighter overall footprint. Combine them with some built-ins and some of your own glue code and you get the same thing with fewer holes.

More tangentially, use persistent lockfiles and do periodic upgrades when warranted (e.g. relevant advisories are out) and check new versions getting installed.
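One way to "check new versions getting installed" is to diff the lockfile before and after an upgrade. A rough sketch, assuming the `package-lock.json` layout of lockfile versions 2 and 3, where a top-level `packages` object maps install paths to entries with a `version` field. `locked_versions` and `diff_locks` are hypothetical helper names.

```python
import json
from typing import Dict, List


def locked_versions(lock_text: str) -> Dict[str, str]:
    """Map each install path in a package-lock.json (v2/v3 layout) to its version."""
    packages = json.loads(lock_text).get("packages", {})
    # The "" key describes the root project itself, not a dependency.
    return {path: meta.get("version", "?") for path, meta in packages.items() if path}


def diff_locks(old_text: str, new_text: str) -> Dict[str, List]:
    """Summarize which locked packages were added, removed, or changed version."""
    old, new = locked_versions(old_text), locked_versions(new_text)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            (path, old[path], new[path])
            for path in old.keys() & new.keys()
            if old[path] != new[path]
        ),
    }
```

Running this against the lockfile from the previous commit gives a short list to review instead of a wall of JSON.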

You use a trusted standard library which has crypto functions (and lots of other helpers) and for small things you write your own.

Yes you can write your own things like session management, yes that is better than the entire web depending on a module for session management which depends on a module which depends on a module maintained by a bored teenager in Russia.

Please do check out other ecosystems, there is another way.

> And what's the alternative?

Using a small number of libraries, where each library provides a large amount of functionality. When I install Django, for instance, four packages are installed, and each package does a substantial amount of work. I don't have to install 1000 packages where each package is three lines of code.

When I'm writing a C program I can somehow depend on only one library for password hashing and one for oauth (maybe two if it also needs curl). In javascript land it's probably a couple dozen, probably from a couple dozen different people.

> static analysis tools exist to scan your 10000 dependencies

Maybe this is a dumb question but could you please suggest some of these tools that can scan dependencies?

Most of those dependencies have a well-defined, stable API. They use or at least try to follow semver. And you're probably only hitting about 10% of your dependencies on the critical path you're using, meaning that a lot of potentially vulnerable code is never executed.

I get the supply chain attacks. I get that you have a tree of untrusted javascript code that you're executing in your app, on install, on build and in runtime. But there's also Snyk and Dependabot, which issue alerts when your dependency tree has published CVEs.

We can talk about alert fatigue, but to be honest, I feel more secure with my node_modules folder than I do with my operating system and plethora of DLLs it loads.

I don't wanna turn this into a whataboutism argument, but at some point you gotta get to work, write some code and depend on some libraries other people have written.

> And you're probably only hitting about 10% of your dependencies on the critical path you're using, meaning that a lot of potentially vulnerable code is never executed.

If a dependency has been compromised it doesn't matter if its code is actually used, since it can include a lifecycle script that's executed at install-time, which was apparently the mechanism for the recent ua-parser-js exploit.
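A quick way to see how exposed a project is to this: list every installed package that declares one of npm's install-time lifecycle hooks (`preinstall`, `install`, `postinstall`). The sketch below assumes a conventional `node_modules` directory on disk; `find_install_scripts` is a hypothetical helper, and it only reports what the manifests declare, it does not judge whether a script is malicious.

```python
import json
from pathlib import Path
from typing import Dict

INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_scripts(node_modules: Path) -> Dict[str, Dict[str, str]]:
    """Return {package name: {hook: command}} for packages declaring install hooks."""
    found: Dict[str, Dict[str, str]] = {}
    for manifest in sorted(node_modules.rglob("package.json")):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest; skipped in this sketch
        if not isinstance(data, dict):
            continue
        scripts = data.get("scripts")
        if not isinstance(scripts, dict):
            continue
        hooks = {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}
        if hooks:
            found[data.get("name", str(manifest.parent))] = hooks
    return found
```

Anything this reports runs code on your machine at install time unless scripts are disabled, whether or not your application ever imports the package.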

> Snyk and Dependabot

Wait, I’m not safe using “npm audit”?

Semver will not save you.

This is the direct result of the culture of tiny dependencies in JS and some other languages, but not all ecosystems are like this. If you choose to use node, this is where you end up, but it was a choice.

Many languages have a decent standard library which covers most of the bases, so it’s possible to have a very restricted set of dependencies.

Unfortunately for frontend and the Node ecosystem, it's too late to try and put the toothpaste back in the tube.

Hopefully Deno helps with this pain point.

I mean, you say that, but the practice of pulling in so many dependencies is fairly recent. It wasn't even possible for most projects before everyone had fast internet.

> It's a little hard to point the finger at application developers here, IMO.

I disagree. Any application developer who seriously thinks that they only have 10 dependencies if they're only importing directly 10 dependencies should not be an application developer in the first place.

You sure about that? Even if you’re writing just for a vetted distribution of an OS, and you write code with zero explicit dependencies, you still have much more than zero dependencies. It’s turtles all the way down. The key is to have an entire ecosystem that you can, to some degree more or less, trust.

No. We've been shouting warnings for years. There have been dozens, if not hundreds, of threads on HN alone warning of supply-chain security threats.

At this point if you're not actively auditing your dependencies, and reducing all of them where you can, then you're on the wrong side of history and going down with the Titanic.

The frank truth is that including a dependency is, and always has been, giving a random person from the internet commit privileges to prod. The fact that "everyone else did it" doesn't make it less stupid.

> Even if you’re writing just for a vetted distribution of an OS, and you write code with zero explicit dependencies, you still have much more than zero dependencies.

Sure, the entire OS is a dependency. Nothing I said contradicts that. And yes, every application developer should be aware of what they are depending on when they write software for a particular OS.

> The key is to have an entire ecosystem that you can, to some degree more or less, trust.

You don't necessarily need to trust an entire ecosystem, but yes, every dependency you have is a matter of trust on your part; you are trusting the dependency to work the way you need it to work and not to introduce vulnerabilities that you aren't aware of and can't deal with. Which is why you need to be explicitly aware of every dependency you have, not just the ones you directly import.

I only imported 10 dependencies, but those 10 dependencies each had 10 dependencies which each had 10 dependencies which each had 10 dependencies and all of a sudden I'm at 10k dependencies again...

The transitive dependency chain should be part of your evaluation of a library. Frameworks are special cases, for sure. But if you’re adding a dependency and it adds 10,000 new entries to your lock file, that should be taken into consideration during your library selection process. Likewise, when upgrading dependencies, you should watch how much of the world gets pulled in.
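To put a number on the transitive chain: a sketch that walks a dependency graph breadth-first and counts everything reachable from the direct dependencies. The graph here is synthetic (10 direct dependencies, fan-out of 10, four levels deep, no sharing between branches), which is an assumption chosen to mirror the example in this thread; real trees overlap heavily. `transitive_deps` and `uniform_tree` are hypothetical names.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def transitive_deps(graph: Dict[str, List[str]], direct: Iterable[str]) -> Set[str]:
    """Every package reachable from the direct dependencies (breadth-first)."""
    seen: Set[str] = set()
    queue = deque(direct)
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue  # already counted; also protects against dependency cycles
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    return seen


def uniform_tree(fan_out: int, depth: int) -> Dict[str, List[str]]:
    """Synthetic worst case: every package has `fan_out` unique dependencies."""
    graph: Dict[str, List[str]] = {}
    level = ["app"]
    for _ in range(depth):
        next_level: List[str] = []
        for parent in level:
            children = [f"{parent}/{i}" for i in range(fan_out)]
            graph[parent] = children
            next_level.extend(children)
        level = next_level
    return graph


graph = uniform_tree(fan_out=10, depth=4)
print(len(transitive_deps(graph, graph["app"])))  # 10 + 100 + 1,000 + 10,000 = 11110
```

The same traversal works on a graph built from a real lockfile, which is usually a more honest number than the count of lines in `package.json`.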

That said, I don’t know what the answer is for JS. There are too many dependency cycles that make auditing upgrades intractable. If you’re not constantly upgrading libraries, you’ll be unable to add a new one because it probably relies on a newer version of something you already had. In most other ecosystems, upgrading can be a more deliberate activity. I tried to audit NPM module upgrades and it’s next to impossible if using something like Create React App. The last time I tried Create React App, yarn-audit reported ~5,000 security issues on a freshly created app. Many were duplicates due to the same module being depended on multiple times, but it’s still problematic.

That's going to be incompatible with writing interesting software on the web, unless we want to just hand the problem over to a handful of big players who can afford to hand-vet 10,000 dependencies.

The reason packages are so big is the complexity for an interesting app is irreducible. People don't import thousands of modules for fun; they do it because simple software tends towards requiring complex underpinning. Consider the amount of operating system that underlies a simple "Hello, world!" GUI app. And since the browser-provided abstractions are trash for writing a web app, people swap them out with frameworks.

I'm working on a React app right now where I've imported about a dozen dependencies explicitly (half of which are TypeScript @type files, so closer to a half-dozen). The total size of my `node_modules` directory is closer to a couple hundred packages. It's 35MB of files. And no, I couldn't really leave any of them out to do the thing I want to do, unfortunately.

People oftentimes do this, with suspicious reasoning. Classic examples:

1) "We have is-array as a dependency" Why? Well, pre Array.isArray, there wasn't anything built-in. Why not just write a little utility function which does what is-array does? See #3

2) "We have both joi and io-ts. Don't they do roughly the same thing?" They do; io object validation. New code uses io-ts, but a bunch of old code relies on joi. Should we update it? Eh we'll get around to it (we never do).

3) "is-array is ten lines of code. why don't we just copy-paste it?" Multiple arguments against this, most bad. Maybe the license doesn't support it. More usually; fear that something will change and you'll have to maintain the code you've pasted without the skills to do so. Better to outsource it (then, naturally, discount the cost of outsourcing).

4) "JSON.parse is built-in, but we want to use YAML for this". So, you use YAML. And need a dependency. Just use JSON! This is all-over, not just in serialization, but in UI especially; the cost analysis between building some UI component (reasonably understood cost) versus finding a library for it (poorly understood cost, always underestimated).

Not all dependency usage is irreducible. Most is. But some of it is born, fundamentally, out of a cost discount on dependency maintenance and a corporate deprioritization of security (in action; usually not in words).

The counterpoint is all the security issues generated when dev teams re-implement the already-well-implemented. Your points are valid, but as with anything, it is not cut and dried.

If your software is ultimately dependent on thousands of other modules from various developers all over the Internet, you have no idea whether what you're depending on is actually well implemented or not.

When I'm writing desktop software, I don't have to worry about whether yaml adds a dependency that I can't afford to maintain.

People who develop web apps want that level of convenience. And if we can't solve the security problem in a distributed fashion, web development will end up owned by big players who can pay the money to solve the problem in a centralized fashion.

> When I'm writing desktop software, I don't have to worry about whether yaml adds a dependency that I can't afford to maintain.

Why not? Because some big, centralized player has put the time, effort, and money into making yaml part of a complete library that gives you everything you need to write desktop software. Nobody writes desktop software by importing thousands of tiny libraries from all over the Internet.

> why don't we just copy-paste it? ... Maybe license doesn't support it.

You did say the argument was bad, but a license that prevents you from making a copy manually but allows you to make a copy through the package manager isn't a thing, is it? In either case the output of your build process is a derived work that needs to comply with the license.

Unless, perhaps, you have an LGPL dependency that you include by dynamic linking (or the equivalent in JS – inclusion as a separate script rather than bundling it?) in a non-GPL application and make sure the end user is given the opportunity to replace with their own version as required by the license.

> The reason packages are so big is the complexity for an interesting app is irreducible

These kinds of claims demand data, not just bare assertions of their truthiness.

Firefox, as an app with an Electron-style architecture (before Electron even existed), was doing some pretty interesting stuff circa 2011 (including stuff that it can't do now, like give you a menu item and a toolbar button that takes you to a page's RSS feed), with a bunch of its application logic embodied in well under 250k LOC of JS.

The last time I measured it, a Hello World created by following create-react-app's README required about half a _gigabyte_ of disk space between just before the first `npm install` and "done".

That NPM programmers don't know _how_ to write code without the kind of complexity that we see today is one matter. The claim that the complexity is irreducible is an entirely different matter.

Firefox's 250k LOC are riding on the millions of lines of code of the underlying operating system and GUI | TCP | audio toolkits that it used. To compare it to npm development, you would need to factor in the total footprint of every package that you had to install to compile Firefox in 2011.

... And I think it's an interesting question to ask why we can trust the security of, say, Debian packages and not npm, given how many packages I have to pull down to compile Firefox that I haven't personally vetted.

> Firefox's 250k LOC are riding on the millions of lines of code of the underlying operating system and GUI | TCP | audio toolkits that it used.

Right, just like every other Electron-style app that exists. The comparison I made was a fair one.

> To compare it to npm development, you would need to factor in the total footprint of every package that you had to install to compile Firefox in 2011.

No, you wouldn't. That's a completely off-the-wall comparison.

How many lines of application code (business logic written in JS including transitive NPM dependencies before minification) go into a typical Electron app in 2021? Into a medium sized web app? Is the heft-to-strength ratio (smaller is better) less than that of Firefox 4, about the same, or ⋙?

> The reason packages are so big is the complexity for an interesting app is irreducible.

This is absolutely, demonstrably false. Can you really claim that you use 100% of the features provided by all of the dependencies you pull in? If not, you are introducing unnecessary complexity to your code.

That doesn't mean that this is necessarily a bad thing, or that we should never ever introduce incidental complexity—we'd never get anything done if that was the case. My point is simply that there exists a spectrum that goes from "write everything from scratch" on one end all the way to "always use third-party code wherever possible" on the other. It's up to you to make the tradeoff of which libraries are worth pulling in for a given project, but when you use third-party code, you inevitably introduce some amount of complexity that has nothing to do with your app and doesn't need to be there.

I don't use 100% of the features I pull in. But I also don't use 100% of the features of libc or gtk if I'm building a GUI app in C.

I have 35 MB of node_modules, but after webpack walks the module hierarchy and tree-shakes out all module exports that aren't reachable, I'm left with a couple hundred kilobytes of code in the final product.

> But I also don't use 100% of the features of libc or gtk if I'm building a GUI app in C.

That’s exactly my point. This is a tradeoff that’s inherent to software development and has nothing to do with the web or Node or NPM. You could just as well decide to write your desktop app with a much smaller GUI library, or even write your own minimal one, if the tradeoff is worth it to reduce complexity. (Example: you’re writing an app for an embedded device with very limited resources that won’t be able to handle GTK.)

> browser-provided abstractions are trash for writing a web app

This is the key.

If browsers would improve here we wouldn't need half of the dependencies that we use now. It took nearly a decade to get from moment.js to some proper usable native functions for example.

Besides that we _really_ need to solve the issue of outdated browsers. Because even when those native APIs exist we'll need fallbacks and polyfills and lots of devs will opt for a non-standard option (for various reasons).

The web is still a document platform with some interactivity bolted on top, I love it but it's a fucking mess.

Without more information this mindset is stuck where the web platform was maybe a decade or more ago. Roughly a dog or cat lifetime. Consider the list of APIs at https://developer.mozilla.org/en-US/docs/Web/API I'd be curious to know if anyone active on HN could actually say they have proficiency with the entire list. Professionally speaking I wouldn't call that a mess. I'd call it a largely unused and unexplored opportunity.

Somehow people managed to develop useful software before NPM and node and so on, without having thousands of very small dependencies. Maybe it's because the stuff built in to Javascript is nearly useless? And the older languages had a standard library that included most of the useful stuff you'd need to build something?

Ruby, Python, Go, Rust, etc all have this exact same problem; it's not unique to NPM.

JS has a culture of using lots of small, composable modules that do one thing well rather than large, monolithic frameworks, but that's only an aggravating factor; it's not the root of the problem.

The root problem is no stdlib and a language design riddled with edge case foot guns that are easy to miss in what should be trivial code.

They do not, they have capable and trusted standard libraries and it’s quite possible to build a web app in those other languages without any external dependencies whatsoever.

JS and its culture of small dependencies that do one thing but import 100 other things to do that thing is the root of the problem here.

> that do one thing well

And sometimes even something the language already does, but the author didn’t know.

Part of that was that we didn't make major changes to how we did things every other project back then. If we needed to do X and that wasn't built in to the language or standard library we were using we would either write our own X library or we could take the time to carefully evaluate the available third party X libraries and pick a high quality one to use. We could justify spending the time on that because we knew we'd be taking care of not just our immediate X needs but also the X needs for our next few years worth of projects.

BTW, you can build a lot of interesting things with jQuery alone.

> That's going to be incompatible with writing interesting software on the web

Lots of people are writing interesting web software without these problems - the website you’re currently posting on is one example. So I completely disagree with this statement and think you need to examine your assumptions.

There is life outside npm.

"Interesting" was a bad choice for specificity here on my part. By the definition I mean, HN isn't interesting... It's got interesting content, but the UI is a dirt-simple server-side-generated web form.

OpenStreetMap is "interesting." Docs and Sheets are "interesting." Autodesk Fusion 360 is "interesting." Facebook is "interesting." Cloud service monitoring graph builders are "interesting." The Scratch in-browser graphical coding tool is "interesting." Sites that are pushing the edge of what the browser technology is capable of are "interesting."

None of the sites you mention above would require npm to build.

At some stage after you've seen enough 'interesting' dependencies changing the world around your app as you write it you'll realise that boring is good for most of the tech you depend on - the more boring the better, and the fewer dependencies the better.

You might be surprised how small a team it took to produce Microsoft Office 2000 (last good version), or the Windows NT kernel, or WhatsApp.

One need not be a big player to write good code without 10000 dependencies.

I have to think there's a lot of YAGNI going on, dependencies that are included to be a better version of native functionality. A faster JSON parser, say, with I dunno, 20 dependencies (a count which may further extend within those deps) for something where slow JSON parsing has not yet become an issue. I think there's a lot of "academic" inclusions out there like this.

My experience working on tens of front end projects is the complete opposite. Nobody is adding dependencies just for the fun of it, or because you might need it in a year. You add a dependency because you need some functionality and there is no time/budget to re-do it in house - not to mention that if it's a well-supported library with, for example, hundreds of thousands of users, it's unlikely you could even make it better.

> there is no time/budget to re-do it in house

What are the actual time cost savings when you take the total costs into consideration?[1][2] What would it look like if you didn't implement an app by stringing together dozens/hundreds/thousands of third-party modules implemented bottom-up, but instead took control of the whole thing top-down?[3]

1. https://jvns.ca/blog/2021/11/15/esbuild-vue/

2. https://news.ycombinator.com/item?id=24495646

3. https://www.teamten.com/lawrence/programming/write-code-top-...

Then you are shit out of luck and vulnerable to supply-chain attacks. Good luck with that.

Well, that's what I'm wondering. GNU/Linux distros like Debian and Ubuntu don't seem to suffer supply chain attacks, but it's not entirely clear to me why. Is it because the distros are more carefully curated, and the infrastructure for extending them older so it has had more time to wrestle security concerns to the ground?

Or is it, disquietingly, the possibility that they are completely vulnerable to this sort of attack and either nobody has noticed they're compromised or attackers haven't decided that compromising a major desktop Linux distro is worth the time?

https://www.zdnet.com/article/open-source-software-how-many-...

Distributions like Debian are _highly_ aware of supply chain attacks. That's one of the key reasons for projects like Reproducible Builds [0] and rekor [1] existing.

So yes, distributions are carefully curated, with a large team of experts vetting the system in a huge number of ways, and are always looking to improve upon them. Because attackers are actively attempting to compromise major distributions.

[0] https://wiki.debian.org/ReproducibleBuilds

[1] https://lwn.net/Articles/859965/

Unfortunately most modern JavaScript tooling has made this very difficult. Before you even have a "hello world" app running create-react-app et al. will install literally a thousand random packages. It's already over.

Maybe 10 stable dependencies without dependencies? Otherwise it's dependencies all the way down.

Is vendoring in a dependency just slowing things down? Slows down development and bakes existing attacks in longer.

What’s the alternative? Writing everything in house? I think a better solution would be a better dependency installer/resolver that is as secure as possible.

> What’s the alternative?

Don't use the popular hype garbage. Yes, I realize that may not be an option for a lot of people professionally. But I believe if you actually spend some time on due diligence for any dependency you consider adding, you can significantly reduce the number of untrusted deps you pull in.

One of the problems of course is that javascript exacerbates this problem somewhat by not having a comprehensive standard library. But whenever I look for go libraries, go.sum is usually one of the first files I click to check how much garbage it pulls in.

Standard library is a dependency too and can have bugs in it. What's better - having stdlib tied to the runtime release schedule or having a lot of micro libraries on their own rolling release schedule which can quickly release security patches?

I agree, having those dependencies authored by Node.js Foundation itself will yield a higher level of trust. But we're all human, and one can argue earnest open source developers have better aligned incentives than a randomly selected Node.js Foundation employee.

I honestly am not sure I fully agree with what I've just written above either. But one thing I would want to pinpoint: those things are NOT black and white. The specific set of trade offs the Node.js ecosystem has fallen into might look accidental and inadequate. But I think it's fairly reasonable.

Yes using a standard library is better. It is more stable, trustworthy and maintained by a small group of people.

I would be with you, but leftpad was a thing. Anyone importing leftpad (or any of the similar 5-line dependencies) has no leg to stand on here.

Yes, you should write leftpad in-house. Anything that is a copy-paste Stack Overflow answer should not be a dependency.
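For scale, an in-house left pad is about this much code. The sketch is in Python purely for illustration (`left_pad` is a hypothetical helper); Python already ships `str.rjust`, and JavaScript has had `String.prototype.padStart` since ES2017, so in both languages the built-in is usually the better choice.

```python
def left_pad(text: str, width: int, fill: str = " ") -> str:
    """Pad `text` on the left with `fill` until it is at least `width` characters."""
    if len(fill) != 1:
        raise ValueError("fill must be exactly one character")
    return fill * max(0, width - len(text)) + text
```

For example, `left_pad("5", 3, "0")` gives `"005"`, the same result as `"5".rjust(3, "0")`.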

You’re not wrong, I’ll admit, but if we judge everything by the most extreme examples we’d still be writing assembly and only mathematicians would be programmers. I’m sure there’s a universe where that’s the case, and I’m sure there’s a percentage of people here who wish that were the case, but I’d say the world is better off with separation of skill sets and I’d rather leave the writing of libraries to people who enjoy writing them and can do it well.

How about we just go back to writing all the trivial stuff in house?

Nobody is suggesting we each write our own charting library, but we should each be capable of writing that function that picks a random integer between 10 and 15. Because the npm version of that function will have the four thousand dependencies that everybody likes to mock whenever npm is discussed.
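
As a sketch of how small that function is (the name `randomIntInclusive` and the optional `rng` parameter are illustrative assumptions, not any library's API):

```typescript
// Returns an integer in [min, max], both ends included.
// `rng` must return a float in [0, 1); it defaults to Math.random and is
// injectable so the function can be tested deterministically.
function randomIntInclusive(
  min: number,
  max: number,
  rng: () => number = Math.random
): number {
  if (!Number.isInteger(min) || !Number.isInteger(max) || min > max) {
    throw new RangeError("min and max must be integers with min <= max");
  }
  return min + Math.floor(rng() * (max - min + 1));
}

const roll = randomIntInclusive(10, 15);
```

Note that `Math.random` is not cryptographically secure, so this is fine for picking a UI color and not fine for generating tokens.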

Other People’s Javascript is generally pretty terrible. My policy is to only use it when absolutely necessary.

Frameworks and library authors could stand to do more in-house. It's also on devs to vet a library for maintenance concerns like sprawling dependencies.
Or a very large dep like Apache Commons in Java that you can trust, rather than one dependency for zip compression, one dependency for padding, one dependency for HTTP error codes, and so on?
That's essentially what a Linux distribution is.
How do you police what your imports import? Serious question. Let's say I'm building a Discord app (as I want to do.) Well, either NPM or Python PIP to get one module - the discord module. But who knows how safe what it imports is. That's the point.

Are there stable dependencies from reputable companies that do the things I want without me vetting 10k submodule imports?

It may require picking a different language with a different culture. JS badly needs a more capable standard library.
That's the crux of the matter. Server-side you can, and should, choose a different platform than Node.js but for the browser we're all stuck with JS. A more capable standard library, where vetting everything would be much more feasible, would do much to improve the situation.
I somewhat naively assume that at least if I use plain React or Angular then

- someone at Facebook or Google has vetted the dependency graph for those

- I also assume they have internal Snyk-like tools

- I also assume other users have similar tools

so someone should catch it.

When it comes to anything else I often look into what it pulls in.

Also I keep an eye on the yarn.lock-file in pull requests.

> so someone should catch it.

Just a week or two ago, a malicious NPM package was published which, for the hour or so that it was up, would be pulled in by any installation of create-react-app, since somewhere in the dependency tree it was specified with “^” to allow for minor updates.

Any machine that ran “npm i” with CRA or who knows how many other projects during that hour may have compromised credentials.

1 hour to find and unpublish the malicious package is a fast turnaround time, so someone was watching and that’s great. But any NPM tree that includes anything other than fully-specified and locked versions all the way down the tree is just waiting for the next shoe to drop.
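
To make the “^” point concrete, here is a rough sketch of a check that flags anything in a package.json that is not an exact version. The `Manifest` type and the function name are assumptions for illustration, and the regex is a simplification of semver, not a full parser.

```typescript
type DependencyMap = Record<string, string>;

interface Manifest {
  dependencies?: DependencyMap;
  devDependencies?: DependencyMap;
}

// Matches exact versions such as "1.2.3" or "1.2.3-beta.1".
// Anything else (^, ~, *, ranges, tags like "latest", URLs) counts as floating.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Returns "name@range" for every direct dependency that is not pinned exactly.
function findFloatingDependencies(manifest: Manifest): string[] {
  const all = { ...manifest.dependencies, ...manifest.devDependencies };
  return Object.entries(all)
    .filter(([, range]) => !EXACT_VERSION.test(range))
    .map(([name, range]) => `${name}@${range}`)
    .sort();
}
```

This only covers direct dependencies. Transitive ones are decided by the lockfile, which is why installing strictly from it (`npm ci`, or `yarn install --frozen-lockfile` on Yarn 1) matters as much as pinning your own package.json.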

So my specific usecase (write a Discord bot) has the solution of "write everything from scratch" or "don't use JS"?

That's kinda what I assumed, but "only run code that has been signed off on by a major company" is kinda a shitty solution.

This requires that you're pulling in only exactly the same versions of those dependencies as those that Facebook and Google have vetted. Is there a way to do that?
A combination of things, I think.

1. Running those builds in VMs is a good idea.

2. Monitoring for weird behavior.

3. Restricting build scripts from touching anything outside of the build directory.

4. Pressuring organizations like npm to step up their security game.

It would be really nice if package repositories:

1. Produced a signed audit log

2. Supported signing keys for said audit log

3. Supported strong 2FA methods

4. Created tooling that didn't run build scripts with full system access

etc etc etc

I started working on a crates.io mirror and a `cargo sandbox [build|check|etc]` command that would allow crates to specify a permissions manifest for their build scripts, store the policy in a lockfile, and then warn you if a locked policy increased in scope. I'm too busy to finish it but it isn't very hard to do.
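
The "warn if a locked policy increased in scope" part is the easy bit. A minimal sketch of that comparison, written in TypeScript rather than Rust purely for illustration; the permission strings and the `Policy` shape are invented here and are not the format of any existing cargo tool:

```typescript
// Hypothetical permission names, e.g. "fs:read:build-dir" or "net".
type Permission = string;
// Package name -> permissions its build script asks for.
type Policy = Record<string, Permission[]>;

interface ScopeIncrease {
  pkg: string;
  added: Permission[];
}

// Compares the policy stored in the lockfile with what packages request now
// and reports every permission that was not previously granted.
function findScopeIncreases(locked: Policy, requested: Policy): ScopeIncrease[] {
  const increases: ScopeIncrease[] = [];
  for (const [pkg, perms] of Object.entries(requested)) {
    const allowed = new Set(locked[pkg] ?? []);
    const added = perms.filter((p) => !allowed.has(p));
    if (added.length > 0) increases.push({ pkg, added });
  }
  return increases;
}
```

A package that is new to the lockfile shows up with all of its requested permissions as "added", which seems like the right default: a human has to look at it once.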

Thanks. I was thinking of a CI step that checked the SHA-256 of yarn.lock against a "last known good" value committed by an authorized committer and enforced by a branch policy.
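
Something like the following would do for the comparison itself. It is a sketch assuming Node's built-in `crypto` module; where the "last known good" hash is stored and how the branch policy protects it are left out.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the given file contents.
function sha256Hex(contents: string | Uint8Array): string {
  return createHash("sha256").update(contents).digest("hex");
}

// True if the lockfile hashes to the expected value.
// The expected value is normalized so a trailing newline or uppercase hex
// in the committed hash file does not cause a false mismatch.
function lockfileMatches(lockfileContents: string | Uint8Array, expectedHex: string): boolean {
  return sha256Hex(lockfileContents) === expectedHex.trim().toLowerCase();
}
```

In CI you would read `yarn.lock` and the committed hash file, call `lockfileMatches`, and exit non-zero on a mismatch. This only detects that the lockfile changed; someone still has to review what changed before blessing a new hash.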

Signed audit logs seem like a good idea.

Now...how to get developers to avoid using NPM and Yarn altogether on sensitive projects...

>How do we even mitigate against these types of supply-chain attacks

I know HN is usually skeptical of anything cryptocurrency/blockchain related, and I am too. But as weird as it sounds, I think blockchain might actually be the solution here.

The problem with dependency auditing is it's a lot of work. And it's also duplicate work. What you'd really like to know is whether the dependency you're considering has already been audited by someone you can trust.

Ideally someone with skin in the game. Someone who stands to lose something if their audit is incorrect.

Imagine a DeFi app that lets people buy and sell insurance for any commit hash of any open source library. The insurance pays out if a vulnerability in that commit hash is found.

* As a library user, you want to buy insurance for every library you use. If you experience a security breach, the money you get from the insurance will help you deal with the aftermath.

* As an independent hacker, you can make passive income by auditing libraries and selling insurance for the ones that seem solid. If you identify a security flaw, buy up insurance for that library, then publicize the flaw for a big payday.

* A distributed, anonymous marketplace is actually valuable here, because it encourages "insider trading" on the part of people who work for offensive cybersecurity orgs. Suppose Jane Hacker is working with a criminal org that's successfully penetrated a particular library. Suppose Jane wants to leave her life of crime behind. All she has to do is buy up insurance for the library that was penetrated and then anonymously disclose the vulnerability.

* Even if you never trade on the insurance marketplace yourself, you can get a general idea of how risky a library is by checking how much its insurance costs. (Insurance might be subject to price manipulation by offensive cybersecurity orgs, but independent hackers would be incentivized to identify and correct such price manipulation.)

The fact that there is actual value here should give the creator a huge advantage over other "Web 3.0" crypto junk.

This is a pretty clever application of DeFi, thanks. DeSec? Can't help but wonder if there still would be incentive for lone wolves to slip backdoors and vulnerabilities into libraries though[0].

[0]: https://portswigger.net/daily-swig/smuggling-hidden-backdoor...

> I can't help but wonder if the root cause was HTTP request smuggling, or if changing package.json was enough.

Maybe I'm just incredibly cynical from my experiences with the intersection of the JS ecosystem and security, but...

...I'd bet dimes to dollars it's the latter (just changing the package.json). My guess is they authenticate but don't actually scope the authentication properly, and no one noticed because no one thought to look.

Of course, as we've seen in the past decade, there's so much inertia behind the JavaScript ecosystem that none of this is going to fundamentally change. It'll just take another decade or so for the ecosystem to reinvent all of the wheels and catch up to the rest of the space.

And at that point it will probably be considered stuffy and "enterprise" and the new hotness unburdened from such concerns will repeat the cycle again.

> to reinvent all of the wheels and catch up to the rest of the space.

Which of the public package systems are the state of the art that should be replicated?

The 'wheels' might simply be having a standard library and fewer packages instead of the micropackage mess.

For example, look at Django: it provides more functionality than React (though the two are not directly comparable). But installation is quick and there is a small number of packages from trusted authors.

The ecosystem is orthogonal to how good the package manager is.

Java’s works really well.

I think it makes other package managers look like a toy.

I assume you are referring to Apache Maven tooling (and compatible) and the pom repos, like Sonatype's Central Repo.

PGP package signing is a huge plus. Is that a requirement for publishing?

How many different repos do you typically have to deal with in the average project?

Would Sonatype react quickly to malware issues like this in the repository? Have there been examples of similar package hijacking?

It’s a requirement for the central repo if I recall.

And the best part is the signature handling is a part of Java, not the package manager, so nothing needs to be re-invented. The default class loader checks the signatures at runtime as well.

Typically you need 1-2 repositories, but often just 1. But if you’re an organization, you can set up your own repository very easily and use it to store private deps and to cache deps (which also allows you to lock binaries and work offline). Repo mirroring is super easy to set up. If you have an internal repo, you can just have your internal project use your own repo and your computer never has to directly reach out to the Internet for a package.

Unlike other languages, the “central repo” and the package manager tooling are independent and package resolution is distributed. When you start a project, you choose your repos. I don’t know how quickly Sonatype would react personally, but they are only the de facto default. Many packages are published on several repos and mirroring is a default feature of a lot of repo software. If Sonatype started screwing up, everyone could abandon them instantly, which forces them to be better.

> I'm seriously considering moving to a workflow of installing dependencies in containers or VMs, auditing them there, and then perhaps committing known safe snapshots of node_modules into my repos (YUCK). Horrible developer experience, but at least it'll help me sleep at night.

I have had people tell me in discussions online, also entirely seriously, that running a package manager to install a dependency while developing is inherently dangerous and anyone who does it outside of a disposable sandboxed VM deserves everything they get. If the packages are inexplicably allowed to do arbitrary things with privileged access to the local system without warning at installation time then clearly the first part is correct, but victim-blaming hardly seems like a useful reaction to that danger.

>How do we even mitigate against these types of supply-chain attacks, aside from disabling run-scripts, using lockfiles and carefully auditing the entire dependency tree on every module update?

Don't trust the package distribution system - use public key crypto.

Public key crypto doesn't help much if your private keys get stolen, which was essentially what happened with some of the recent hacked packages and which is why they're now starting to enforce 2FA.
The longer term solution to this is public key signatures with an ephemeral key, rooted to some trusted identity source (e.g., a GitHub account with strong 2FA). There’s lots of work on that front coming out of the Open Source Security Foundation.
are you really using private keys without a passphrase in 2021?
It's very easy: add a dev signature in the repo that cannot be changed ever, and force the devs to sign their stuff before allowing a change of binary or a download.

That way anything can try to upload, but it will fail the signature check.
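
A minimal sketch of that check, assuming Node's built-in `crypto` module and an Ed25519 key pinned for the package. The function name and the demo data are made up, and this says nothing about how the registry would store or protect the pinned key.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Returns true only if `signature` over `artifact` verifies against the
// public key pinned for this package.
function isUploadAuthentic(artifact: Buffer, signature: Buffer, pinnedPublicKey: KeyObject): boolean {
  // For Ed25519 the algorithm argument is null; the key type selects the scheme.
  return verify(null, artifact, pinnedPublicKey, signature);
}

// Demo with a throwaway key pair standing in for the developer's key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const tarball = Buffer.from("contents of my-package-1.0.0.tgz");
const goodSignature = sign(null, tarball, privateKey);

const accepted = isUploadAuthentic(tarball, goodSignature, publicKey);
const rejected = isUploadAuthentic(Buffer.from("tampered contents"), goodSignature, publicKey);
```

Here `accepted` is true and `rejected` is false, because the signature no longer matches the modified bytes. None of this helps if the private key itself is stolen or the developer is the attacker.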

This assumes that the developers themselves are not malicious (see: left-pad) and that their signing keys can't be stolen.
Also: "This vulnerability existed in the npm registry beyond the timeframe for which we have telemetry to determine whether it has ever been exploited maliciously."
Having different services trust different (and unrelated) bits of the request is an immortal classic though, great stuff.
The part that made sure the user could update the package could have at least checked whether the payload is about that package before passing it to the service that trusted it.
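
Roughly the kind of check meant here, as a sketch. This is not npm's code; the `PublishRequest` shape and all names are assumptions. The only thing taken from the post is that authorization looked at one thing and the publishing service trusted the uploaded file.

```typescript
interface PublishRequest {
  // Package name the authorization check was actually run against
  // (for example, taken from the request path).
  authorizedName: string;
  // Parsed package.json from inside the uploaded tarball. Untrusted input.
  uploadedManifest: { name?: unknown; version?: unknown };
}

// Throws unless the uploaded manifest names the same package the caller
// was authorized to publish.
function assertPublishTargetMatches(req: PublishRequest): void {
  const { name } = req.uploadedManifest;
  if (typeof name !== "string" || name !== req.authorizedName) {
    throw new Error(
      `refusing publish: authorized for "${req.authorizedName}" but upload declares ${JSON.stringify(name)}`
    );
  }
}
```

The safer design is probably for the downstream service to never read the target from the payload at all and take it only from the authorized request, but a check like this at the boundary would at least have caught the mismatch.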
That one, combined with the other “ability to read names of private packages” issue, makes for the possibility of a really, really sneaky attack. I wonder how many orgs treat their private npm packages with significantly less scrutiny than the public ones they rely on?
No CVE mentioned. Hard to grok for me, could someone educate me on why this is missing from the blog post?
services don't get CVEs.
Well, didn't we just experience two major npm published packages containing malware? Both had CVEs.

Now we have the probable root cause, buried in a wall of text. No CVE.

CVEs alert end users that they need to take action to apply updates. That's relevant when a specific npm package contained a known vulnerability. It's not relevant when the npm server contained a known vulnerability. There's nothing a user of npm can do to update the npm server.

CVEs don't just mean "this is a big security problem".

hehe...

CVE: "the entire javascript/ruby/python development model is insecure"

affected: "the whole damn internet"

resolution: "rewrite the last 10 years of internet development from scratch"

not sure that's gonna happen

At least the npm packages outside their telemetry horizon should be updated immediately.
Yes, because pure services don't get CVEs. CVEs are for distributed software.
Isn't this the biggest security flaw in the package ecosystem ever?

They don't even know if, when, or by whom this was exploited, but maybe I didn't pay close enough attention to the few paragraphs devoted to the real problem.

So shouldn't we assume all NPM packages published prior to the 2nd of November are compromised?

And if so, shouldn't this deserve a CVE? (https://en.wikipedia.org/wiki/Common_Vulnerabilities_and_Exp...)

CVEs aren't usually assigned for "there might be something wrong", only for specific, identified issues.