Ask HN: Does anyone investigate open source packages before using in prod?

	14 points by lyttlerock 2186 days ago
	I'm curious to hear if anyone else does any due diligence before using open source packages in production? Not anything major - just checking for recent commits / activity, issue logs, etc.

12 comments

alltakendamned 2185 days ago

It's interesting to see so many people here checking the code of all their open source packages, so here's my take on it as a security consultant:

No, most people don't; they even have a hard time keeping library versions up to date.

jfoster 2185 days ago

Do you have a pragmatic approach that you typically recommend, considering that (without version pinning thousands of packages) anything could change from one day to the next? (even if a package was "good" today, it could turn "bad" tomorrow)

The best I've been able to come up with is to pick things that have minimal dependencies of their own. It doesn't eliminate the threat, but it does at least reduce it.

alltakendamned 2184 days ago

No, it’s not easy to realize. From a security perspective the idea is to always run the latest. Breaking backwards compatibility becomes a more difficult proposition. In reality you need to have an engineer test or analyze the updates. Some mature libraries maintain backwards compatibility (e.g. OpenSSL), but if you're using something like npm it becomes almost impossible.

cpach 2185 days ago

IIRC there is some kind of SaaS solution for this but right now I can’t recall its name. Maybe someone else here knows?

Edited: I reached out to some security people and it seems like the following are popular tools for this use case: Snyk / Dependabot / Whitesource.

gitgud 2185 days ago

GitHub has a security check that scans your repo's dependencies and warns you about vulnerabilities too.

stevekemp 2184 days ago

I used to audit open source code for security issues on a regular basis, and even now, before I install a public-facing application, I generally have a look at the code.

It's not often I spot anything major, but I figure if I have the time I should do it just in case.

I often look at the code for PHP extensions, npm libraries, and similar that colleagues introduce. Just to be sure there's not anything blatantly horrid going on.

austincheney 2186 days ago

Yes.

If you work in a secure environment or support critical infrastructure there are teams whose sole purpose is to approve/deny releasing software regardless of who wrote it. Such teams will typically require source code, written justification, senior management signed approval, and test validation. In the case where source code is not provided, such as closed source commercial software, the vendor will be required to accept liability for all losses due to their software as ratified by a signed contract.

WA9ACE 2186 days ago

I normally read a good chunk, if not all, of the code of a dependency before I add it to my projects, except in the case of community standard things (in Ruby) such as ActiveSupport or Sequel. Going over a prospective dependency a few months ago bore fruit in proving why you should always do this. NewsAPI is a neat little API for fetching news whose docs just so happen to show a Ruby gem. Being the lazy developer I am, I’d rather use the gem than build another API client, but before I did that I read the source as one should. Lo and behold, what do I find but the evil eval in the code for a dirt simple API client. No thanks.

https://github.com/olegmikhnovich/News-API-ruby/blob/master/...

amoitnga 2183 days ago

Is this malicious? I'm honestly curious, as I still don't have much experience in the field. My answer to OP's question is NO, but I'd like to grow.

jfoster 2185 days ago

This article might be of interest:

https://medium.com/hackernoon/im-harvesting-credit-card-numb...

nikitaga 2185 days ago

I am paranoid about security of all those packages, so yes, even before just downloading, I check the authors, activity and read the source code. Not always – e.g. I skip the source code if it's something big AND very reputable AND I decided that I need it such as scala/scala or facebook/react – but I do my best.

It's very annoying, it's not free, and it affects what kinds of libraries I use. My projects have fewer and smaller dependencies than typical because of these self-imposed constraints.

On the upside, borrowing a pattern or a dozen lines of code instead of pulling a dependency that will remain 90% unused is really underrated. As is understanding how things work under the hood.

jfoster 2185 days ago

React itself is big & reputable, but the dependency tree is massive and I doubt that it's getting fully vetted on an ongoing basis. Even if you vet it today, any given dependency can be updated to something else tomorrow.

There are definitely some things in React's dependency tree that are a bit questionable if you are sensitive enough to any given problem, beyond just security. For example, packages where the license declared in package.json contradicts the LICENSE file, or where the full license terms appear in neither and are only clarified in the README.md.
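
A rough sketch of how the package.json vs LICENSE contradiction could be flagged automatically. Everything here is an assumption for illustration: the function names are made up, the marker phrases are a crude heuristic covering only three licenses, and only the plain-string form of the `license` field is handled. A real audit would want a proper license scanner.

```python
import json

# Crude, incomplete heuristic: a phrase that typically appears in each license text.
LICENSE_MARKERS = {
    "MIT": "permission is hereby granted, free of charge",
    "Apache-2.0": "apache license version 2.0",
    "ISC": "permission to use, copy, modify, and/or distribute",
}


def detect_license(license_text):
    """Guess an SPDX id from LICENSE file text, or return None if unrecognized."""
    normalized = " ".join(license_text.lower().split())
    for spdx_id, marker in LICENSE_MARKERS.items():
        if marker in normalized:
            return spdx_id
    return None


def check_license_consistency(package_json_text, license_text):
    """Return 'match', 'mismatch', or 'unknown' for one package."""
    declared = json.loads(package_json_text).get("license")
    detected = detect_license(license_text)
    # Only plain string declarations are handled in this sketch.
    if not isinstance(declared, str) or detected is None:
        return "unknown"
    return "match" if declared == detected else "mismatch"
```

An "unknown" result is the case described above where the terms live somewhere else (such as the README), so it still needs a human to look.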

overkalix 2185 days ago

> My projects have fewer and smaller dependencies than typical

Taking a look at your GitHub projects and build.sbt... this is quite an understatement.

nikitaga 2183 days ago

Haha well tbh I just didn't feel the need for more deps in such libraries. It's a tougher choice in application code!

uvw 2186 days ago

I would be surprised if anyone has enough resources or willingness to do that for every open source package they are using. For companies that go through auditing, they can CTA by relying on products like Nexus IQ.

carapace 2186 days ago

Yes, in depth. Not just the packages but their dependencies as well.

gitgud 2186 days ago

Really? What about all the dependencies from those dependencies?...

For example, our company is working on an app that has 82 npm dependencies and over 17,000 resolved npm packages...

It's absolutely ridiculous to investigate all of them... but it's also necessary if you want to be sure...

carapace 2185 days ago

> Really? What about all the dependencies from those dependencies?...

Yep, all the way to the end.

I got the idea from a book called "Hollywood Secrets of Project Management Success" by James R. Persse. It's two books interleaved really, one is just a standard pitch for Agile methods (IMO), but the other is a presentation of the process that large film studios use to make movies. The movie industry is ~100 years old and mostly very good at bringing in projects on time and under budget.

Somewhere in there he talks about how they'll track their dependencies in a kind of "portfolio", I forget the details, but it translates in IT to a "dependency portfolio" and you would (if you're large enough) have an actual "Deps Dept." and a Deps Manager whose sole job is tracking dependencies and their updates and patches, etc.

> working on an app that has 82 npm dependencies

Ach! Well, see, there's your problem right there. :-)

Seriously though, one of the benefits of a dependency portfolio is to help you know when your system has gotten out of hand. The problems are still there even if you don't look at them, eh?

> It's absolutely ridiculous to investigate all of them... but it's also necessary if you want to be sure...

Ya feel. ;-)

gitgud 2185 days ago

Thanks for the response, that's an interesting way to deal with it. How do you verify a dependency? Do you literally examine the source code? Make sure the build is reproducible? Or just the metadata (downloads, stars)? Has the portfolio actually prevented any vulnerabilities?

It's pretty common for JS projects to have thousands of transitive dependencies, so I'm not sure keeping a private portfolio is much use. The entire open-source ecosystem is built on the foundation of trust. If I use a package that's being used by 500 other packages, I can have a high degree of certainty that the package is safe, and by locking the dependencies with yarn.lock I can prevent sneaky updates from entering the system.

Anyway maybe I'll look into the dependency portfolio, see how it goes.
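
To make the lockfile point concrete, here is a minimal sketch of spotting "sneaky updates" by comparing two snapshots of pinned versions. It assumes the lockfile has already been parsed into a plain name-to-version mapping (parsing yarn.lock itself is out of scope here), and the function name and package names are hypothetical.

```python
def lockfile_drift(before, after):
    """Compare two {package: version} mappings and report what changed."""
    added = {name: after[name] for name in after.keys() - before.keys()}
    removed = {name: before[name] for name in before.keys() - after.keys()}
    changed = {
        name: (before[name], after[name])
        for name in before.keys() & after.keys()
        if before[name] != after[name]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Made-up snapshots for illustration only.
old_pins = {"left-pad": "1.3.0", "tiny-router": "2.0.1"}
new_pins = {"left-pad": "1.3.0", "tiny-router": "2.0.2", "extra-helper": "0.1.0"}
drift = lockfile_drift(old_pins, new_pins)
```

Anything that shows up under "added" or "changed" without a matching intentional upgrade is worth a second look before it reaches production.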

carapace 2184 days ago

Cheers!

> How do you verify a dependency? Do you literally examine the source code?

Yeah. It's part of the overhead of using the software. You also look at the history of bugs and how they were handled.

> It's pretty common for JS projects to have thousands of transitive dependencies

Yeah, I know, and it's bonkers IMO.

> The entire open-source ecosystem is built on the foundation of trust

In practice, yes, but in theory, no. The whole idea is that you get to see the code you're running, because the guys who wrote it are clowns. Free Software started when RMS wanted to fix his printer and Xerox said, "No."

> if I use a package that's being used by 500 other packages, I can have a high degree of certainty that the package is safe

I think history has shown that that reasoning is at best probabilistic, eh? You're gambling.

Now, of course, there are limits. Some things get a pass. Do we audit the source of the bash shell? No, despite the fact that it's maintained by a single volunteer.

> Anyway maybe I'll look into the dependency portfolio, see how it goes.

Check out that "Hollywood Secrets" book I mentioned.

amoitnga 2183 days ago

Thanks for sharing. It would be very interesting to hear some examples of when it paid off. Could you please share?

seanwilson 2185 days ago

How long does it take you to do this for e.g. all the dependencies for a bare bones Angular, Vue or React app?

carapace 2185 days ago

I couldn't begin to estimate. (We don't use those.)

seanwilson 2185 days ago

I just don't see how anyone could realistically look at all the lines of code that any nontrivial JavaScript app relies on in any depth.

I'm sure most people don't review the code for their operating system, drivers, web server, compiler, browser etc. but they do assess if the entities that write + support them are worth trusting. This is likely the only realistic approach for complex JavaScript apps also.

carapace 2184 days ago

> I just don't see how anyone could realistically look at all the lines of code that any nontrivial JavaScript app relies on in any depth.

Right. And that's really bad.

> I'm sure most people don't review the code for their operating system, drivers, web server, compiler, browser etc.

Right, but some people do. Hire one of them. (And if your "props dept." can't keep up with the changes to all the things that's also really bad.)

> they do assess if the entities that write + support them are worth trusting.

No one is a magic code elf. (Some people come close. Fabrice Bellard might count. But even that worthy commits bugs.)

Like I said in a sib comment, yeah, some things get a pass. Bash shell for example. Then again, remember e.g. "heartbleed"?

seanwilson 2184 days ago

> > I just don't see how anyone could realistically look at all the lines of code that any nontrivial JavaScript app relies on in any depth.

> Right. And that's really bad.

> > I'm sure most people don't review the code for their operating system, drivers, web server, compiler, browser etc.

> Right, but some people do. Hire one of them.

The interesting question isn't if you can do it, it's when should you, to what extent, and how much it will cost.

"Always do it, do it in-depth, the time consumed isn't important and the budget isn't important" is a bad approach for example and isn't helpful to the OP.

Successful software development is all about making appropriate tradeoffs - you're not going to get very far by conducting your own OpenSSL audit when all you want to do is write a todo web app.

jfoster 2185 days ago

On what cadence do you do this? Every release?

carapace 2185 days ago

The "deps portfolio" gets updated whenever the deps change. In practice the flow goes like this:

0. A dev wants to use a new dependency, likely after experimenting with it a little bit.

1. Preliminary evaluation, which includes a transitive dependency scan. ("Too many dependencies" is a valid fail condition all on its own.)

2. If everything looks good we bring it and its deps into our internal repo. This includes the plumbing to add it to our dev|test|production envs. (Using Docker or whatever.)

3. Now the devs can use it in code destined for prod. There's a nice page in the company wiki that lists the exact version(s) with links to the docs, bug trackers, mailing lists, etc. and also the internal company lore for that package.

It's tight.

- - - -

This might seem like a lot of work up front, but think about all the work it saves down the line.
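
A minimal sketch of the step 1 scan, assuming the dependency graph is already available as a plain mapping from each package to its direct dependencies. The function names, the package names, and the `MAX_TRANSITIVE_DEPS` threshold are all made up for illustration, not a description of any particular tool.

```python
from collections import deque

# Hypothetical threshold; pick whatever your team considers "too many".
MAX_TRANSITIVE_DEPS = 50


def transitive_deps(graph, root):
    """Return every package reachable from `root`, excluding `root` itself.

    `graph` maps a package name to the list of packages it depends on directly.
    """
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    seen.discard(root)  # a dependency cycle could lead back to the root
    return seen


def passes_preliminary_scan(graph, candidate, limit=MAX_TRANSITIVE_DEPS):
    """Fail the candidate when its transitive dependency count exceeds `limit`."""
    return len(transitive_deps(graph, candidate)) <= limit


# Made-up dependency graph for illustration only.
example_graph = {
    "web-framework": ["router", "templating"],
    "router": ["path-utils"],
    "templating": ["path-utils", "escape-html"],
    "path-utils": [],
    "escape-html": [],
}
```

The set returned by `transitive_deps` doubles as the list of packages that would have to be brought into the internal repo in step 2.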

jfoster 2185 days ago

When one of the transitive dependencies fixes a security issue, is it then re-evaluated prior to being updated in the internal repo?

I'm guessing you work at a pretty large tech company. It seems wasteful that so many companies might be replicating this work. I wonder if there might be the opportunity for a body to review & approve packages on behalf of many companies. Perhaps npm will eventually move in this direction.

carapace 2185 days ago

> When one of the transitive dependencies fixes a security issue, is it then re-evaluated prior to being updated in the internal repo?

Yes, but this is typically pretty low overhead. And when it's not, it usually means there is some issue that has to be addressed anyway.

> I'm guessing you work at a pretty large tech company.

I did once, but right now "we" is a tiny startup (we're using Elm and Erlang.)

> I wonder if there might be the opportunity for a body to review & approve packages on behalf of many companies.

Ideally, that's what Free/Open Source Software would be, eh?

In the old days there were "sysadmins", System Administrators, who handled a lot of this sort of thing.

bjourne 2185 days ago

Doesn't everyone? That's one of the annoying parts of using other people's code. You have no idea how good or bad it is until you have thoroughly vetted it.

bnchrch 2185 days ago

These comments feel skewed. I look for activity and support. Reading the actual code is typically far from my mind.

The time-opportunity cost isn't worth it on average.

amoitnga 2183 days ago

In my case, no. But I tend to use the big guns. I'm willing to learn though.

bluGill 2186 days ago

Yes, if there are no commits at all I know I'm stuck maintaining it.

sigjuice 2185 days ago