by ddevault 1967 days ago
Even if we run the same math with 7 out of 10 bugs being memory safety related, and assuming that Rust prevents all of them, those same example programs end up with 30 bugs in Rust and 10 bugs in C.

There's another argument I could make, too. Look at the bug tracker for the program you want to rewrite in Rust, examining the historical bugs. You'll find that there are often hundreds or thousands of mistakes that they made and already fixed in the original codebase. If you're rewriting it from scratch, can you be sure you won't make just as many? A stable, maintained codebase with a low throughput of changes tends to have fewer bugs over time, as the lack of churn avoids introducing new bugs and the application of time susses out all of the existing bugs. Rewriting the whole thing from scratch has a very high rate of churn, introducing a whole new slew of bugs on its own.

Now, a small codebase, focused on delivering its key value-adds without distractions, kept stable and at a low-churn rate over a long period of time: no matter what language you use, this is the best recipe for reliability and security.


So again why does it have to be "rewrite at 1/10th the complexity in <Language A>" (10%) vs "rewrite in <Language B> at full complexity" (30%)? What's preventing using Language B for the complexity rewrite and getting 0.1 * (1 - 0.7) = 3%?
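To make that arithmetic concrete, here is a minimal sketch in C. It assumes a baseline of 100 bugs for the full-complexity C program, which is inferred from the 30 and 10 figures quoted above rather than stated outright; the function name `remaining_bugs` and the use of integer percentages are hypothetical choices for illustration.

```c
#include <assert.h>

/* Bugs left after scaling a baseline program to size_pct percent of its
 * original complexity and preventing prevented_pct percent of its bugs.
 * Integer percentages keep the arithmetic exact. The baseline of 100 used
 * in the cases below is an assumption inferred from the thread's numbers. */
unsigned remaining_bugs(unsigned baseline, unsigned size_pct, unsigned prevented_pct) {
    assert(size_pct <= 100 && prevented_pct <= 100);
    return baseline * size_pct * (100 - prevented_pct) / (100 * 100);
}
```

Under those assumptions, `remaining_bugs(100, 10, 0)` is the simplified C rewrite, `remaining_bugs(100, 100, 70)` is the full-complexity Rust rewrite, and `remaining_bugs(100, 10, 70)` is doing both at once.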

Rewrites do bring the chance to Royally Screw it Up™, so it's certainly not simply a matter of "it is now written in <Language X>, therefore safe". But as was said, not only have projects shown that security didn't fall apart, they have shown the opposite.

I agree you don't get there by a bunch of yolo rewrites to whatever is hip though; it has to be a planned effort that isn't rushed. In much the same way, quickly writing a small replacement utility does not inherently make it more secure or reliable than an existing, significantly more complex utility. Even just trying to shave some functionality off the existing code is rife with "but how does removing this piece affect the app's remaining logic" and takes time and effort to do right.

Both methods do have to be done right and both do greatly help security, but there is nothing about picking a memory-safe language or making a significantly narrower-focused utility that precludes the other.

You can do both! But because simplicity has a substantially greater impact than the language choice, I think it's better to focus on that. Right now, the ecosystem is focusing more on the language choice, and hardly talking about simplicity at all. And particularly in the case of Rust, I think it fails simplicity a lot in its own ways - in the stdlib, the compiler and toolchain, the language design - and the trade-offs don't really make sense for a lot of use cases that people are pining for it over anyway.
helps that a 10kloc c program getting riir'd probably won't be a 10kloc rust program, because c doesn't have libraries and rust does.

it is literally impossible to write "a small codebase focused on its key value-adds without distractions" in a language that doesn't have strings and requires you to build a dictionary from scratch

>helps that a 10kloc c program getting riir'd probably won't be a 10kloc rust program, because c doesn't have libraries and rust does.

What? Rust has so few libraries of significance that it still depends on C for security-critical areas like SSL.

>it is literally impossible to write "a small codebase focused on its key value-adds without distractions" in a language that doesn't have strings and requires you to build a dictionary from scratch

Strings are misunderstood, I'm not going to get into it here. My dictionaries in C usually clock in at about two dozen lines of code. The complexity doesn't go away because your language does it for you.

Having written dictionary implementations in C, I would be very interested in seeing your implementation that fits in two dozen lines of code.
Threw together an example (untested, with obvious errors) to give you an idea of what it could look like:

https://paste.sr.ht/~sircmpwn/3122d4a27a8e5312462e2329bf7ed6...

Actually managed to get it to exactly 2 dozen lines of code, not including the header, which isn't bad for an off-the-cuff remark.

You'd naturally expand or shrink this with whatever subset of map functions you require, like key/value enumeration, object deletion, resizing, whatever. It depends on your use-case. I don't believe in generic code.
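For readers who can't open the paste, here is a rough idea of what a dictionary of about that size can look like in C. This is not the code from the link above; it is a minimal sketch assuming a fixed-capacity, open-addressing table with string keys, no deletion, and no resizing. All names (`DICT_CAP`, `struct dict`, `dict_set`, `dict_get`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DICT_CAP 64 /* hypothetical fixed capacity; this sketch never resizes */

struct dict_entry { const char *key; void *value; };
struct dict { struct dict_entry slots[DICT_CAP]; };

/* 32-bit FNV-1a hash over the bytes of a NUL-terminated key. */
uint32_t dict_hash(const char *key) {
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h = (h ^ (unsigned char)*key) * 16777619u;
    }
    return h;
}

/* Linear probing: returns the slot holding key, else the first empty slot,
 * or NULL when the table is full. Correct only because nothing is deleted. */
struct dict_entry *dict_slot(struct dict *d, const char *key) {
    assert(d != NULL && key != NULL);
    size_t start = dict_hash(key) % DICT_CAP;
    for (size_t i = 0; i < DICT_CAP; i++) {
        struct dict_entry *e = &d->slots[(start + i) % DICT_CAP];
        if (e->key == NULL || strcmp(e->key, key) == 0) {
            return e;
        }
    }
    return NULL;
}

/* Inserts or overwrites. The key pointer is stored, not copied.
 * Returns 0 on success, -1 if the table is full. */
int dict_set(struct dict *d, const char *key, void *value) {
    struct dict_entry *e = dict_slot(d, key);
    if (e == NULL) {
        return -1;
    }
    e->key = key;
    e->value = value;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
void *dict_get(struct dict *d, const char *key) {
    struct dict_entry *e = dict_slot(d, key);
    return (e != NULL && e->key != NULL) ? e->value : NULL;
}
```

A caller zero-initializes the table (`struct dict d = {0};`) and then uses `dict_set` and `dict_get`. Deletion, enumeration, and resizing are exactly the parts left out here, and adding them is where the line count starts to grow.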

Ok, that makes more sense. I was considering a slightly more fully-featured table and including the header (see: https://gist.github.com/saagarjha/00faa1963023206a8ccd987798...) and I was a couple times larger than your number, so I was trying to figure out what you were doing that I was unable to replicate…
that's not true these days, rustls is a great TLS lib that has been through at least one serious external security audit.

https://cure53.de/pentest-report_rustls.pdf

For what it's worth, rustls relies on ring, which has primitives written in C and ASM because getting constant-time operation guarantees from Rust is Very Hard. Though progress is being made in this area.
Except that Rust is also a much much more expressive language. Even ignoring things like solid module support and libraries you'll find your Rust programs to be much fewer LoC (assuming bug/LoC is the right metric) for equivalent functionality.

I agree that rewrites have the serious potential to introduce new bugs and the cost is rarely worth it if the codebase is actually that stable and low throughput, but the reality is that most aren't. A one-time high cost in exchange for introducing 70% fewer bugs over a period of N years starts to look like a good trade-off.

Yes, complexity is the root of all evil. I can get on board with the whole statement except the "no matter what language you use". If we have the ability to use a language that enforces memory safety, we should use it.

Lines of code is a poor approximation for complexity. Rust programs are shorter, but they are not less complex. The AST is similar and the graph of relationships between different parts of the code is much more complex than in C. Overall I'd say it balances out at best, if Rust isn't simply more complex.
The sudo code in question is typical C: string processing with pointers and hand-rolled byte manipulation, size calculations, manual buffer allocation and freeing, and so on. The Rust equivalents of all this are far simpler.
Lines of code is a great approximation for complexity, or at least for how many bugs you're writing: https://softwareengineering.stackexchange.com/questions/1856...
Perhaps indeed! But a crucial distinction is that I consider the complexity in the language, compiler, and standard library to all be influences on your program's total complexity as well. Using std::List (or whatever you call it) has the same total complexity as writing your own little growable array.
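For a sense of scale, a "little growable array" in C might look something like the following. This is a minimal sketch assuming an array of `int` that doubles its capacity when full; the names `int_vec`, `int_vec_push`, and `int_vec_free` are hypothetical, and overflow of the capacity calculation is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of int; zero-initialize before first use. */
struct int_vec { int *items; size_t len, cap; };

/* Appends value, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if allocation fails (the array is left intact). */
int int_vec_push(struct int_vec *v, int value) {
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        int *grown = realloc(v->items, new_cap * sizeof(int));
        if (grown == NULL) {
            return -1;
        }
        v->items = grown;
        v->cap = new_cap;
    }
    v->items[v->len++] = value;
    return 0;
}

/* Releases the storage and resets the array to its empty state. */
void int_vec_free(struct int_vec *v) {
    assert(v != NULL);
    free(v->items);
    v->items = NULL;
    v->len = v->cap = 0;
}
```

Whether these couple dozen lines carry the same total complexity as a standard library container is the point under dispute in this thread; the sketch only shows how much code the hand-rolled version involves.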
From the point of view of bugginess, complexity in the implementations of massively popular libraries is far less of an issue than code you just wrote yourself, because the code in those libraries will have received much more testing than the code you just wrote yourself. So it doesn't really make sense to just add the complexity of components up like that.
sudo is quite a popular utility, by the same logic it might be expected to be well tested...
> "Even if we run the same math with 7 out of 10 bugs being memory safety related, and assuming that Rust prevents all of them, those same example programs end up with 30 bugs in Rust and 10 bugs in C."

Maybe, but not necessarily; it's reasonable to assume that Microsoft and Facebook put non-zero effort into designing around, programming around, testing, looking for, and fixing memory-safety-related issues in their C code. It could be the case that not having to care so much about those frees up some non-trivial amount of attention and time which could be spent on the other classes of problems.

Similarly, it is possible that they would use the time to add new features with new bugs. I'd personally suspect that to be the more common outcome.