Hacker News
by mr_00ff00 1120 days ago
Curious if any senior devs on HN can comment on the importance/effectiveness of audits for crates?

I’m a junior C++ dev that dabbles with rust in my free time, and I always feel a bit nervous when pulling huge dependency trees with tons of crates into projects.

I would assume most places would turn away from the “node.js” way of doing these things and would just write internal versions of things they need.

Again, I am junior, so maybe my worries are way overblown.

7 comments

I think in a lot of C++ and ex-C++ orgs you see this sentiment a lot, and sometimes for good reason. Sometimes that code has real security or performance reasons to worry about this. Often, though, it doesn't.

On the other hand, Python folks and JavaScript users (which make up a lot of emigres to Rust) probably don't care enough about their supply chain. That's how you end up with misspelled packages causing viruses in production and other disasters.

The short answer to this is that it actually depends a lot on what you are doing.

> That's how you end up with misspelled packages causing viruses in production and other disasters.

For all the stories about malicious packages on PyPI and whatnot: I can't recall ever seeing a story about "misspelled packages caused us problems in production". Most of these packages have downloads in the low-hundreds at best, and I wouldn't be surprised if the vast majority are from the attackers testing it and bots automatically downloading packages for archiving, analysis, etc. I've come to think it's not as much of a big deal as it's sometimes made out to be.

The closest I've seen is the whole event-stream business where the maintainer transferred it to someone else who promptly inserted some crypto-wallet stealing code, but that's a markedly different scenario (and that also seems quite rare; it was over 4 years ago).

> For all the stories about malicious packages on PyPI and whatnot: I can't recall ever seeing a story about "misspelled packages caused us problems in production".

https://medium.com/@alex.birsan/dependency-confusion-4a5d60f...

Discussed at the time: https://news.ycombinator.com/item?id=26087064

That's a different thing; it would (ab)use some package tools' preference of public packages over private ones (at least in some configurations). It's not really a "supply chain issue" but more of a "footgun in some package tools"-issue.
Well they've been subpoenaed so probably something happened.
It surprises me that most people who use Rust come from Python and JavaScript. I would think Rust is so popular because people are moving from C and C++ and getting all the nice modern features to do systems programming with.

Python and JavaScript people I would imagine find Rust annoying since it’s all the niceties they are used to but with a bunch of rules on top.

I see people coming to Rust from all angles. It’s a nice sweet spot. I came to it from Haskell, and on my team of three the other two devs came to Rust from C++. My guess is that the motivation looks something like this:

* From Haskell - looking for a strong type system (with sum types, typeclasses) but is “widely accepted”. No GC.

* From C++ - looking for low-level capabilities (pointers, references) with improved safety. Improved manual memory management.

* From Python/JS - looking for performance with a familiar feeling ecosystem and a welcoming community

I think the Python and JS folks will have the hardest go of it, but they also have the most to gain.

I went Python → Rust. Rust is high-level enough for me to remain competitively productive with Python, and the type-system just helps so much. I can't tell you how nice it has been to not have to see "Object of type NoneType has no attribute 'foo'" anymore. Also,

  def foo(obj):
    # ...
And me wondering "what is 'obj'?" and then pulling that root … and pulling … and pulling …. Also "data", another just fabulous name for a variable, really narrows the possibilities. Even once you know the "type" of the variable, oftentimes I'd find that the type definition would subtly shift and morph in different parts of the program: they'd all want a Duck, but have varying opinions on what a Duck actually is. You cannot commit such BS with an actual type-checker.

There is more up-front work with the compiler, but it pays off: the code that passes the compiler is of much better quality.

Also, Option<T> is very nice to have when you need an actual Option<T> (i.e., generically), which Python lacks. (No, `None` is not it: you run into problems when T == Option<U> and you have both None and Some(None); Python cannot differentiate between the two.) Also sum types in general.
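
To make the None vs. Some(None) point concrete, here is a minimal sketch. It assumes a made-up settings map whose values are themselves optional; the names `lookup_timeout` and `describe` are hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper: look up a "timeout" setting whose value may be
/// explicitly unset. The outer Option says whether the key exists at all;
/// the inner Option says whether it currently holds a value.
fn lookup_timeout(settings: &HashMap<String, Option<u32>>) -> Option<Option<u32>> {
    settings.get("timeout").copied()
}

/// Hypothetical helper: name each of the three distinguishable cases.
fn describe(result: Option<Option<u32>>) -> &'static str {
    match result {
        None => "key missing",
        Some(None) => "key present, value unset",
        Some(Some(_)) => "key present, value set",
    }
}

fn main() {
    let mut settings: HashMap<String, Option<u32>> = HashMap::new();
    println!("{}", describe(lookup_timeout(&settings)));

    settings.insert("timeout".to_string(), None);
    println!("{}", describe(lookup_timeout(&settings)));

    settings.insert("timeout".to_string(), Some(30));
    println!("{}", describe(lookup_timeout(&settings)));
}
```

In Python, a plain `dict.get("timeout")` would return `None` for both of the first two cases unless you reach for a separate sentinel value.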

Right but you essentially just mentioned strong typing. There are a billion languages that don’t run into those issues.

Java, Go, C, Scala, Haskell, etc. all fix those type issues.

Sure, I suppose, but sum types (which I mention) eliminate all but Scala and Haskell from that list.
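
For readers who haven't used sum types, a rough sketch of what they buy you in Rust. The `AuditStatus` type and `may_ship` function are invented for this example and are not taken from any real tool.

```rust
/// Hypothetical sum type: a dependency is in exactly one of these states,
/// and each state carries only the data that makes sense for it.
#[derive(Debug, Clone, PartialEq)]
enum AuditStatus {
    Unaudited,
    Audited { reviewer: String },
    Exempted(String), // free-form reason for skipping the audit
}

/// The match must cover every variant; adding a new variant later turns
/// every match that forgot to handle it into a compile error.
fn may_ship(status: &AuditStatus) -> bool {
    match status {
        AuditStatus::Unaudited => false,
        AuditStatus::Audited { .. } => true,
        AuditStatus::Exempted(reason) => !reason.is_empty(),
    }
}

fn main() {
    let statuses = [
        AuditStatus::Unaudited,
        AuditStatus::Audited { reviewer: "alice".to_string() },
        AuditStatus::Exempted("build-only tool".to_string()),
    ];
    for status in &statuses {
        println!("{:?} -> may ship: {}", status, may_ship(status));
    }
}
```
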
Python and JavaScript programmers so massively outnumber all other programmers that they are the majority of people converting to any language.
The "node.js" way of doing things, and its dysfunction, is nearly exclusive to Node because JavaScript lacks a standard library and because of npm's haphazard way of running things. Java, Ruby, Python, even my grandfather's Perl have had "modules" for years with none of the fear that is typically associated with Node.

Personally, I think C++'s aversion to sane dependency management is more about C++'s "I know better than you" culture and legacy cruft (packages are usually managed by the distro, not the language) than about any serious security implications.

This is slowly changing with Conan and vcpkg seeing increasing adoption.

Still, most environments I have worked in always had internal repos for packages; no CI/CD server talks to the outside world, and vendoring isn't allowed.

In a way, Rust's standard library is closer to Node's than Python's. You can't really do much without getting some crates in.
> I would assume most places would turn away from the “node.js” way of doing these things and would just write internal versions of things they need.

Incorrect assumption; look up the left-pad fiasco [1]. Its importance is really a personal opinion; convenience nearly always trumps security, so if the NPM way allows you to increase sales by ~10% you'll see people continuing to do it.

Google is fairly principled though, all of the 3p code is internally vendored and supposed to be audited by the people pulling in that code/update.

[1]: https://www.google.com/search?q=leftpad+broke+the+internet

Writing your own version of everything means it's probably more tuned to your needs. But unless it's a core part of your software it will also be worse because you can't justify putting many resources into it. It also means new hires will have to learn a lot more. It's one of the (many) reasons why it's so hard to onboard into C/C++ projects, because every standard building block is bespoke and somehow different than what everyone else does. Of course if you are really big you just have those resources, which is why Meta or Google can have bespoke everything.

On security it's a tradeoff. The open-source version is an easier target for attackers, but might be much more battle-tested and thus more bug-free. Audits are the attempt to have the best of both worlds here, and since they can in turn be crowd-sourced (with cargo-vet and cargo-crev both working on this), this scales even for companies that aren't Google-sized.

I've reviewed hundreds of Rust crates. It's tedious and boring. The results are boring too — their code is mostly good! Big dependency trees have a reputation for being hot garbage, but that's not my experience. In Rust the small focused crates tend to do one thing, and do it well.
> I would assume most places would turn away from the “node.js” way of doing these things and would just write internal versions of things they need.

I assume most places don't care.

Dependencies are dependencies in Rust as in C++. I've found it's extremely rare that homegrown libraries with similar functionality to open-source libraries that are actually in use are better from a security standpoint.

At least in Rust a large part of the security issues that would be VERY time consuming to audit at scale through your dependency tree (whether internal or public) are covered by the compiler/borrow checker/type-system.

In that sense I would take on a larger number of dependencies in Rust than I would in C++ while sleeping better.