Hacker News new | ask | show | jobs
Can We Trust the Libraries We Use? (bytes.com)
36 points by v4us 4114 days ago
6 comments

I think this is a very interesting and important question and is very similar to one that I've considered a number of times without any real answer.

My question boils down to "can we trust compound knowledge". I'm defining compound knowledge as knowledge that builds on other knowledge. It's kind of like proofs in math, once you have learned a proof you often learn a shortcut for that proof that might not make a ton of sense at a glance and almost like "magic". Similar to using libraries that "just work", someone already figured out the hard stuff and they've abstracted it so you can easily use the concepts at a high level.

Things like frameworks are amazing tools and give 1000's (and often many more) man-hours of work for free. However just like math proofs if you don't understand what it's doing under the hood you can get tripped up when you try to do something that "looks" like it should work but actually doesn't.

I'm not advocating that we all must know what every library we use does under the hood, for most higher-level languages JS/PHP/Python/Ruby/etc this would be impossible (not just knowing what, for example, a php library does with PHP but what's happening in C to make that happen, you can take that even further and go down to machine code). We have to make some assumptions but I don't think/know if there is a hard line on how low we need to go.

Personally I stay as high level as I can until there is an issue then I will dig deeper, I'm sure there is a whole spectrum of developers ranging from ones like me to ones that read every line of a library before they include it in their project.

I don't have the answers, like I said this is something I think about a lot. Take a look at the NodeJS/npm ecosystem, you can easily see when you run 'npm install' that you are installing Library A that makes use of Library B and C which makes use of Library D and E.... So on and so forth. It's one of the best things about OS software development, you can build on what already exists instead of always re-inventing the wheel. But that comes at the cost of having to make assumptions about the code you are using. I always worry that I'm building a house of cards that will come tumbling down and even worse if it all comes tumbling down in a library I use so it's not as easy for me to fix it directly.

Again, I don't have any answers, this is just me rambling about a concept I bat around in my head all the time.

Compound knowledge works in maths, because of the solidity of proofs. You don't wake up one day and find that you need to update all your existing proofs to Axiom Of Choice 11.3.5 in order to fix a critical security bug.

Programming has weak versions of this: if all your libraries typecheck, then you can write a typecheckable program with them.

Functional languages get a lot closer to reliable composition, but things that modify state are a lot harder to reason about because the state space just explodes. And you can of course write in Coq or do formal verification in Z - it's just much more time-consuming. As an industry we'd rather get to market first than get the product 100% reliable.

> Compound knowledge works in maths, because of the solidity of proofs

Yes, I didn't mean to compare them 1 to 1 but more of in concept.

> As an industry we'd rather get to market first than get the product 100% reliable.

Very true and as long as security isn't core focus of your app I think this is just fine.

Absolutely, no, but the interesting question is whether we can trust them as much as or more than stuff we roll ourselves.
No, we can't. And no, we don't have a choice to trust them anyway.

And what's the alternative? Write everything yourself? Besides investing a lot more work and time for the same functionality of your end-user software, you're still stuck with code that may be broken, contains bugs or security problems.

One of the interesting parts of this is the distinction between different types of applications: safety critical and otherwise. If, as other commenters are saying, no we can't trust libraries, but yes we have to anyways, then a library may be good enough for non-safety critical applications and get all the adoption and work. Then you come along with your safety critical application for which this library isn't trustworthy enough (by assumption). What are you to do here?

I'd love to see a bigger ecosystem of safety critical code floating around. Libraries in it would be less full-featured and optimized than their less trustworthy counterparts with wider adoption, but it's a niche I really hope to see develop as code takes over the world.

To the extent that we can trust our testing methods of those libraries.
Where I work, in .NET we use various static analysis tools: StyleCop, CodeAnalysis and CodeContract. Warnings fail the build. And we are using a statically typed language of course. Oh... and gated check-ins.

It takes a few more minutes to write some code that build but it sure avoids a lot of silly bugs. Code contracts especially so!