by vosper 1680 days ago
People don't directly import thousands of modules. It's actually a lot closer to your "10 stable dependencies". But those dependencies have dependencies that have dependencies. It's a little hard to point the finger at application developers here, IMO.
4 comments

Some of the comments in this thread are wild. Huge dependency trees are a bad pattern, plain and simple.

The problem isn’t only ridiculous amounts of untrusted code, but thousands of new developers of the last 10 years who think this is the way to write reliable code. They never acknowledge the risks of having everyone else write your code for you, and they overestimate how unique and interesting their apps are.

If you must participate in this madness, static analysis tools exist to scan your 10000 dependencies; taking security seriously is the issue.

> Huge dependency trees are a bad pattern, plain and simple.

And what's the alternative? Do you write your own libraries to store and check password hashes complete with hash and salt functions? Roll your own google oauth flow? Your own user session management library?

It's madness on either side; the difference is that `npm install` and pray allows you to actually get things done.

A large standard library is a big part of the solution. Your project may pull in a crypto library that includes password hashing, and an oauth library, and a session management library, but all of those libraries will have few or no dependencies outside of the standard library.
Every time this discussion comes up about the JavaScript ecosystem and the "problems", the solution everyone brings to the table is "Have a large standard library".

You know JavaScript doesn't have one, don't you? That is why this "issue" exists. Putting the cat back in the bag is impossible.

When vetting a dependency, consider whether it depends on packages you already depend on. If the new dependency tree is too large, try breaking down the desired functionality: multiple smaller, lower-level direct dependencies may have a lighter overall footprint. Take them, some built-ins, and some of your own glue code, and you get the same thing with fewer holes.
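To make the vetting step concrete, here is a minimal sketch of the "what would this actually add?" check. It assumes you already have the dependency graph as a plain mapping from package name to direct dependencies; the package names and the `new_packages` helper are hypothetical, not part of any real tool.

```python
from collections import deque


def transitive_deps(graph, root):
    """Return every package reachable from root (root itself excluded unless cyclic)."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    return seen


def new_packages(graph, current_direct, candidate):
    """Packages that adding `candidate` would install beyond what is already there."""
    installed = set(current_direct)
    for dep in current_direct:
        installed |= transitive_deps(graph, dep)
    incoming = {candidate} | transitive_deps(graph, candidate)
    return incoming - installed


# Hypothetical graph: package name -> direct dependencies.
GRAPH = {
    "web-framework": ["http-parser", "cookie-util"],
    "session-lib": ["cookie-util", "uid-gen", "left-padder"],
    "tiny-session": ["cookie-util"],
    "uid-gen": ["random-bytes"],
}

# Compare two candidates against what the project already pulls in.
print(sorted(new_packages(GRAPH, ["web-framework"], "session-lib")))
print(sorted(new_packages(GRAPH, ["web-framework"], "tiny-session")))
```

In this made-up graph the second candidate only adds itself, because its one dependency is already installed, which is the kind of overlap the comment suggests looking for.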

More tangentially, use persistent lockfiles, do periodic upgrades when warranted (e.g. when relevant advisories are out), and check which new versions get installed.
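One rough way to "check new versions" is to diff the lockfile before and after an upgrade. The sketch below assumes the `packages` map found in npm lockfile versions 2 and 3 (install path mapped to metadata including `version`); the sample lockfiles and the `diff_locks` helper are made up for illustration.

```python
import json


def locked_versions(lock_text):
    """Map each install path in a lockfile's "packages" section to its version."""
    lock = json.loads(lock_text)
    versions = {}
    for path, meta in lock.get("packages", {}).items():
        if path == "":  # the empty key describes the root project itself
            continue
        versions[path] = meta.get("version")
    return versions


def diff_locks(old_text, new_text):
    """Return (added, removed, changed) between two lockfile snapshots."""
    old, new = locked_versions(old_text), locked_versions(new_text)
    added = {p: new[p] for p in new.keys() - old.keys()}
    removed = {p: old[p] for p in old.keys() - new.keys()}
    changed = {p: (old[p], new[p]) for p in old.keys() & new.keys() if old[p] != new[p]}
    return added, removed, changed


# Hypothetical before/after lockfiles.
OLD_LOCK = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/cookie-util": {"version": "0.4.1"},
        "node_modules/ua-sniffer": {"version": "1.2.0"},
    },
})
NEW_LOCK = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/cookie-util": {"version": "0.4.1"},
        "node_modules/ua-sniffer": {"version": "1.2.1"},
        "node_modules/ua-sniffer/node_modules/helper": {"version": "0.0.1"},
    },
})

added, removed, changed = diff_locks(OLD_LOCK, NEW_LOCK)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Keying on the install path rather than the bare name is deliberate here, since the same package can appear more than once at different versions under nested `node_modules` directories.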

You use a trusted standard library which has crypto functions (and lots of other helpers) and for small things you write your own.

Yes you can write your own things like session management, yes that is better than the entire web depending on a module for session management which depends on a module which depends on a module maintained by a bored teenager in Russia.

Please do check out other ecosystems, there is another way.

> And what's the alternative?

Using a small number of libraries, where each library provides a large amount of functionality. When I install Django, for instance, four packages are installed, and each package does a substantial amount of work. I don't have to install 1000 packages where each package is three lines of code.

When I'm writing a C program I can somehow depend on only one library for password hashing and one for oauth (maybe two if it also needs curl). In javascript land it's probably a couple dozen, probably from a couple dozen different people.
It's not two if you need curl. Curl has a large number of dependencies; the only difference is that they're visible with npm.
How many developers write C programs versus how many developers write JS apps?

Without accounting for that, your comparison makes no sense! Not to mention that you’re comparing languages at two very different levels. A low-level language like C would never behave like a higher-level language.

> static analysis tools exist to scan your 10000 dependencies

Maybe this is a dumb question but could you please suggest some of these tools that can scan dependencies?

Most of those dependencies have well-defined, stable APIs. They use or at least try to follow semver. And you're probably only hitting about 10% of your dependencies on the critical path you're using, meaning that a lot of potentially vulnerable code is never executed.

I get the supply chain attacks. I get that you have a tree of untrusted JavaScript code that you're executing in your app, on install, on build, and at runtime. But there's also Snyk and Dependabot, which send you alerts when packages in your dependency tree have published CVEs.

We can talk about alert fatigue, but to be honest, I feel more secure with my node_modules folder than I do with my operating system and plethora of DLLs it loads.

I don't wanna turn this into a whataboutism argument, but at some point you gotta get to work, write some code and depend on some libraries other people have written.

> And you're probably only hitting about 10% of your dependencies on the critical path you're using, meaning that a lot of potentially vulnerable code is never executed.

If a dependency has been compromised it doesn't matter if its code is actually used, since it can include a lifecycle script that's executed at install-time, which was apparently the mechanism for the recent ua-parser-js exploit.
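As an illustration of why unused code can still bite, here is a small sketch that flags packages declaring install-time lifecycle scripts. It assumes you have each package's `package.json` already parsed into a dict; the manifests below and the `flag_packages` helper are hypothetical, and only the three hook names `preinstall`, `install`, and `postinstall` are taken from npm itself.

```python
# npm runs these script hooks while a package is being installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_scripts(manifest):
    """Return the install-time scripts a parsed package.json declares."""
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


def flag_packages(manifests):
    """Map "name@version" to its install-time scripts, for packages that have any."""
    flagged = {}
    for manifest in manifests:
        hooks = install_scripts(manifest)
        if hooks:
            key = f"{manifest.get('name')}@{manifest.get('version')}"
            flagged[key] = hooks
    return flagged


# Hypothetical manifests, as if read from node_modules/*/package.json.
MANIFESTS = [
    {"name": "cookie-util", "version": "0.4.1", "scripts": {"test": "mocha"}},
    {"name": "ua-sniffer", "version": "1.2.1",
     "scripts": {"preinstall": "node fetch-payload.js", "test": "jest"}},
    {"name": "left-padder", "version": "1.0.0"},
]

for pkg, hooks in flag_packages(MANIFESTS).items():
    print(pkg, hooks)
```

A hit is not proof of anything malicious, since native modules legitimately compile at install time; it is just a list of places where code runs before your application ever imports anything. Installing with `npm install --ignore-scripts` skips these hooks, at the cost of breaking packages that genuinely need them.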

> Snyk and Dependabot

Wait, I’m not safe using “npm audit”?

Semver will not save you.
This is the direct result of the culture of tiny dependencies in JS and some other languages, but not all ecosystems are like this. If you choose to use node, this is where you end up, but it was a choice.

Many languages have a decent standard library which covers most of the bases, so it’s possible to have a very restricted set of dependencies.

Unfortunately for frontend and the Node ecosystem, it's too late to try and put the toothpaste back in the tube.

Hopefully Deno helps with this pain point.

I mean, you say that, but the practice of pulling in so many dependencies is fairly recent. It wasn't even possible for most projects before everyone had fast internet.
> It's a little hard to point the finger at application developers here, IMO.

I disagree. Any application developer who seriously thinks that they only have 10 dependencies if they're only importing directly 10 dependencies should not be an application developer in the first place.

You sure about that? Even if you’re writing just for a vetted distribution of an OS, and you write code with zero explicit dependencies, you still have much more than zero dependencies. It’s turtles all the way down. The key is to have an entire ecosystem that you can, to some degree more or less, trust.
No. We've been shouting warnings for years. There have been dozens, if not hundreds, of threads on HN alone warning of supply-chain security threats.

At this point if you're not actively auditing your dependencies, and reducing all of them where you can, then you're on the wrong side of history and going down with the Titanic.

The frank truth is that including a dependency is, and always has been, giving a random person from the internet commit privileges to prod. The fact that "everyone else did it" doesn't make it less stupid.

> The frank truth is that including a dependency is, and always has been, giving a random person from the internet commit privileges to prod

I mean, no. This is hyperbole at best and just wrong at median. A system of relative trust has worked very well for a very long time - Linus doesn’t have root access to all our systems, even if we don’t have to read every line of code.

Linus doesn't have root access to our systems for several reasons. One of them is the fact that we get the actual source code, and not just a compiled blob doing "something". Another is the fact that they have at least some level of reviews wrt who can commit code, although this isn't perfect as the case with the University of Minnesota proved.

Npm on the other hand is much, much worse. Anyone can publish anything they want, and they can point to any random source code repository claiming that this is the source. If we look at how often vulnerable packages are discovered in eg. npm, I'd argue that the current level of trust and quality aren't sustainable, partly due to the potentially huge number of direct and transitive dependencies a project may have.

Unless you start to review the actual component you have no way to verify this, and unlike the Linux kernel there is no promise that anyone has ever reviewed the package you download. You can of course add free tools such as the OWASP Dependency Check, but these will typically lag a bit behind as they rely on published vulnerabilities. Other tools such as the Sonatype Nexus platform are more proactive, but can be expensive.

Maybe this is arguing semantics, but unless you run something like Gentoo you will most likely get the Linux kernel as a binary blob contained in a package your distribution provides. There isn't really any guarantee that this will actually contain untampered Linux kernel sources (and in the case of something like RHEL it most likely doesn't, because of backports) unless you audit it, which most people won't do (and maybe can't do). So, in principle at least, this isn't really that much better than the node_modules situation. Security and trust are hard issues, and piling on 100s of random JS dependencies sure doesn't help, but you either build everything yourself or you need to trust somebody at some point.
Linux has all sorts of controls and review policies that NPM doesn't have. It's a false equivalence to say "we trust Linux, so therefore trusting NPM is OK".

If <random maintainer> commits code to their repo, pushes it to npm, and you pull that in to your project (possibly as an indirect dependency), what controls are in place to ensure that that code is not malicious? As far as I can tell, there are none. So how is this not trusting that <random maintainer> with commit-to-prod privileges?

Yeah, this is what I meant, except it goes in all directions. It’s not stating a “false equivalence” because pointing out that you can draw a line between 0 and 100 isn’t stating an equivalence.

Different risk profiles exist. There’s a difference between installing whatever from wherever, installing a relatively well known project but with only one or two Actually Trusted maintainers, and installing a high profile well maintained project with corporate backing.

This is true in Linux land, and it’s true in npm land. You can’t just add whatever repo and apt-get to your heart’s content. Or, you know, you also can, depending on your tolerance for risk.

And should we all start rolling our own crypto now to avoid dependencies? In most cases a stable library is going to be much more secure than a custom implementation of `x`. Everything has trade-offs. What's stupid is dogma.
I know you're being hyperbolic and I also want to add that for crypto you should just use libsodium. The algos and the code are very good. And lots of very smart folk have given it a lot of review. And its API is very nice.
When you say this, do you mean actual C libsodium? Because surely you don’t mean that I, a js developer, should need to figure out how to wrap this .h file thingy to get it to work in js when there’s SIX third-party libsodium implementations/wrappers/projects sitting right there listed on the libsodium website? /s
This. And npm isn't the only instance. (Appreciate the voice in the wilderness Marcus.)
Or perhaps, the sky is not falling.
> Even if you’re writing just for a vetted distribution of an OS, and you write code with zero explicit dependencies, you still have much more than zero dependencies.

Sure, the entire OS is a dependency. Nothing I said contradicts that. And yes, every application developer should be aware of what they are depending on when they write software for a particular OS.

> The key is to have an entire ecosystem that you can, to some degree more or less, trust.

You don't necessarily need to trust an entire ecosystem, but yes, every dependency you have is a matter of trust on your part; you are trusting the dependency to work the way you need it to work and not to introduce vulnerabilities that you aren't aware of and can't deal with. Which is why you need to be explicitly aware of every dependency you have, not just the ones you directly import.

I am actually not sure if this is possible, while also accepting security updates etc from my OS distributor? How do you literally personally vet every line of code that gets run directly AND indirectly by your application, and still have time to write an application?

I’m okay with saying, “I trust RHEL to be roughly ok, just understand the model and how to use it, and keep my ear to the ground for the experts in case something comes up.”

At the level of npm, I feel roughly the same about React. I don’t trust it quite as much, but I’m also not going to read every code change. I’ll read a CHANGELOG, sure, and spelunk through the code from time to time, but that’s not really the same. I’ll probably check out their direct dependencies the first time, but that’s it.

I actually don’t know how you could call yourself an application developer in most ecosystems and know every single dependency you actually have all the way down, soup to nuts. Heck, there are dependencies that I accept so that my code will run on machines that I have no special knowledge of, not just my own familiar architecture. I accept them because I want to work on the details of my application and have it be useful on more than just my own machine.

Edit for clarity: I agree with almost everything you’re suggesting as sensible. Just not with your conclusion: that you’re not a “real” application developer if you don’t know all of your dependencies

> I am actually not sure if this is possible, while also accepting security updates etc from my OS distributor?

Accepting the OS as a dependency includes the security updates from the OS, sure.

> How do you literally personally vet every line of code

Ah, I see, you think "understanding the dependency" requires vetting every line of code. That's not what I meant. What I meant is, if you use library A, and library A depends on libraries B, C, and D, and those libraries in turn depend on libraries E, F, G, H, I, etc. etc., then you don't just need to be aware that you depend on library A, because that's the only one you're directly importing. You need to be aware of all the dependencies, all the way down. You might not personally vet every line of code in every one of them, but you need to be aware that you're using them and you need to be aware of how trustworthy they are, so you can judge whether it's really worth having them and exposing your application to the risks of using them.
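The "aware of all the dependencies, all the way down" idea can be sketched as a walk over the graph that also records why each package is there. The graph below just mirrors the A through I example from the comment, and `dependency_paths` is a hypothetical helper, not an existing tool.

```python
from collections import deque


def dependency_paths(graph, root):
    """Map every package reachable from root to the shortest chain that pulls it in."""
    paths = {root: [root]}
    queue = deque([root])
    while queue:
        pkg = queue.popleft()
        for dep in graph.get(pkg, []):
            if dep not in paths:  # first visit in a breadth-first walk is a shortest chain
                paths[dep] = paths[pkg] + [dep]
                queue.append(dep)
    del paths[root]
    return paths


# Hypothetical graph: you import only A, but A needs B, C, D, and so on.
GRAPH = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["F", "G"],
    "D": ["H"],
    "H": ["I"],
}

paths = dependency_paths(GRAPH, "A")
print(f"1 direct import, {len(paths)} packages actually installed")
for pkg, chain in sorted(paths.items()):
    print(pkg, "<-", " > ".join(chain))
```

The chain is the useful part: it tells you which of your direct choices is responsible for each package you never asked for. On recent npm versions, `npm ls --all` prints a comparable tree for a real project.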

> I’ll probably check out their direct dependencies the first time, but that’s it.

So if they introduce a new dependency, you don't care? You should. That's the kind of thing I'm talking about. Again, you might not go and vet every line of code in the new dependency, but you need to be aware that it's there and how risky it is.

> I actually don’t know how you could call yourself an application developer in most ecosystems and know every single dependency you actually have all the way down, soup to nuts.

If you're developing using open source code, information about what dependencies a given library has is easily discoverable. If you're developing for a proprietary system, things might be different.

I really appreciate your stance, but just have to disagree. If it’s core React, I don’t check beyond what curiosity mandates. If it’s a smaller project with less eyes on it, yes absolutely I’ll work through the dependency chain. But that can also get pretty context dependent, based on where the code is deployed.

But I don’t know how you can make such a strong distinction between “a committed line of code” vs “a dependency”, because the only thing differentiating them is the relative strength of earned trust regarding commits to “stdlib,” commits to “core,” commits to “community adopted,” etc.

It’s too much. There’s a long road of grey between “manually checks every line running on all possible systems where code runs and verifies code against compiled binary” and “just run npm install and yer done!”