by jacquesm 3368 days ago
No, he's spot on. The one thing that keeps happening over-and-over again is that people who have never put together a system of any complexity will yell from the sidelines how others should build their stuff.

This goes for operating systems, application software and advanced rocketry alike.

The reason we have so much C code powering the world is that (1) there wasn't anything better when that code was written, (2) time has weeded out most of the bad bugs to the point that the old codebase will be substantially more stable than anything new no matter what language you would write it in.

If someone is willing to throw a few hundred million or so at re-building an operating system from the ground up that is 100% safe and reliable and that does not have more bugs than what is available right now then they are free to do so. They're also free to donate their own time and to try to enlist their buddies. But they're not free to yell at others how those others should do their jobs.

Show don't tell.

6 comments

> If someone is willing to throw a few hundred million or so at re-building an operating system from the ground up that is 100% safe and reliable and that does not have more bugs than what is available right now then they are free to do so.

But at the same time - we are spending those billions. We keep throwing billions at Apple by buying their new phones, yet they release lists of security issues which this time were basically all:

  - maliciously crafted image -> remote code execution

  - maliciously crafted font -> remote code execution

  - maliciously crafted audio -> remote code execution

I'm not sure if I have to be able to write a font parser and rendering subsystem in a memory safe language myself to be allowed to complain that Apple is using my money for something other than doing just that. To me it's completely mind blowing that someone uses a C library to parse complex third party binary data from the internet.

If a C lib is mind-blowing, what about the kernel and its extensions?

For now C is the only thing we have that can be used so widely. Some day we may be able to use Rust or something else, but that's the future, not the present.

Any lang that easily calls C and can be called by C, or can be compiled to C should be fine to use today.

Re: kernel - it's a lot of pretty simple subsystems. Each system is usually simpler (easier to verify and polish) than a lot of common libs doing really complex things (ssl, video codecs, ttf, ...). Complexity + raw pointers quickly add up to guaranteed problems.
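To make the "complexity + raw pointers" point concrete, here is a minimal sketch of parsing untrusted bytes with bounds-checked slices. The record format (one length byte followed by the payload) and the name `parse_record` are made up for illustration; they are not taken from any real font or image format.

```rust
/// Parse one record from untrusted input: a 1-byte length, then that many payload bytes.
/// Returns (payload, remaining input), or None when the input is truncated.
/// The format and the function name are hypothetical, for illustration only.
fn parse_record(input: &[u8]) -> Option<(&[u8], &[u8])> {
    // split_first returns None on empty input instead of reading out of bounds
    let (&len, rest) = input.split_first()?;
    let len = len as usize;
    // Reject a length field that claims more bytes than are actually present
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}

fn main() {
    let good = [3u8, b'a', b'b', b'c', 0xff];
    assert_eq!(parse_record(&good), Some((&b"abc"[..], &[0xffu8][..])));

    // A length field that lies about the payload size is rejected, not over-read
    let lying = [200u8, b'a'];
    assert_eq!(parse_record(&lying), None);
}
```

Even if the explicit length check were forgotten, `split_at` would panic rather than read past the buffer, so the failure mode would be a crash instead of memory corruption.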

Rust uses llvm. Use it for new development on any platform that llvm supports. A few microcontrollers might be left out.

Rust can speak C's ABI, so you can write kernel extensions in Rust. There's some legwork involved, but it's quite possible.
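A minimal sketch of what "speaking C's ABI" means: a Rust function declared `extern "C"` uses the C calling convention and can be handed to anything that expects a C function pointer. The names `clamp_add` and `call_through_c_abi` are made up here, and the second one is a plain Rust stand-in for a real C API taking a callback. Exporting a symbol by name for a C linker additionally needs a `no_mangle` attribute, which is left out of this sketch.

```rust
use std::os::raw::c_int;

/// A function using the C calling convention (hypothetical example function).
extern "C" fn clamp_add(a: c_int, b: c_int) -> c_int {
    // Saturate instead of overflowing, so the behaviour is defined for all inputs
    a.saturating_add(b)
}

/// Stand-in for a C API that accepts a callback; written in Rust only so that
/// the example is self-contained.
fn call_through_c_abi(cb: extern "C" fn(c_int, c_int) -> c_int, a: c_int, b: c_int) -> c_int {
    cb(a, b)
}

fn main() {
    assert_eq!(call_through_c_abi(clamp_add, 2, 3), 5);
    assert_eq!(call_through_c_abi(clamp_add, c_int::MAX, 1), c_int::MAX);
}
```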
Interesting.

So you're saying that Rust, at the moment, in theory, can be used in any place where C is required? I.e. writing a kext on macOS, a loadable kernel module on Linux, or things like an extension for SQLite are all possible?

In that case I wonder if there are any theoretical blockers for things like (using no_std and) mapping internally used C structures (let's say BSD kernel's rbtree) into typed, first-class Rust citizens - to avoid std overhead and reuse what's already available? In other words - is there anything there in C macros that can't effectively be expressed in Rust for example?

That's correct.

There are no theoretical blockers; tooling could make it a lot easier though. I was hand-porting chunks of the Ruby interpreter, which uses C macros extensively, and it's doable, but not fun.
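As a rough sketch of what such a hand-port looks like: a C struct becomes a `#[repr(C)]` struct with the same field order, and a simple accessor macro becomes a small function. The struct layout and the `NODE_IS_RED` macro below are hypothetical; they are not copied from the BSD rbtree or from the Ruby sources.

```rust
use std::ptr;

/// Hypothetical C layout being mirrored:
///   struct node { uint32_t color; struct node *left; struct node *right; };
#[repr(C)]
struct Node {
    color: u32,
    left: *mut Node,
    right: *mut Node,
}

const RED: u32 = 1;

/// Hypothetical C macro being ported:
///   #define NODE_IS_RED(n) ((n) != NULL && (n)->color == RED)
///
/// # Safety
/// `n` must be null or point to a valid `Node`.
unsafe fn node_is_red(n: *const Node) -> bool {
    // SAFETY: the caller guarantees n is null or valid; as_ref maps null to None
    unsafe { n.as_ref() }.map_or(false, |node| node.color == RED)
}

fn main() {
    let mut leaf = Node { color: RED, left: ptr::null_mut(), right: ptr::null_mut() };
    let root = Node { color: 0, left: &mut leaf, right: ptr::null_mut() };
    unsafe {
        assert!(!node_is_red(&root));
        assert!(node_is_red(root.left));
        assert!(!node_is_red(root.right)); // null child is treated as not red
    }
}
```

Macros that paste tokens or generate whole type definitions need `macro_rules!` or generics instead, which is where most of the "not fun" part comes from.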

>The reason we have so much C code powering the world is that (1) there wasn't anything better when that code was written...

Right, but the criticism he received for his first post was not primarily about choosing the wrong language or about refusing to rewrite in a different language now. It was primarily about downplaying the role of C in the security issues that were found.

The rest of the debate, at least as I read it, was not so much criticism as a generic debate about the pros and cons of various programming languages, like the ones we have every other day on here and on reddit.

Choosing C back then was clearly justified for a project like curl given all the constraints and considerations. And for a rewrite to be a net positive in terms of security probably takes years, not to mention the huge effort that someone would have to put into it.

I have put together systems of considerable complexity (>100k lines), and have been a working professional programmer for almost 2 decades. Most of my work does not involve C, but various "safe" languages.

There were better alternatives already available at the beginning of the 70s, and certainly when most of the software was written that we are using today. Just to name a few I am familiar with:

- Pascal being as old as C, and later on

- Modula 2, Oberon, the Oberon System was completely implemented in Oberon

- Eiffel

and of course, today:

- Rust

- Go

So, even back then, there were safer alternatives to C around. C gained more popularity, and consequently the tooling gave it some edge, but there would have been alternatives. Of course, when you have a nice debugged C program, that is something worth keeping. And as a consequence, for mostly historic reasons, a lot of our current infrastructure is built on top of C - but that alone does not mean that C was the perfect or even the right tool to build all of this upon.

> (2) time has weeded out most of the bad bugs to the point that the old codebase will be substantially more stable than anything new no matter what language you would write it in.

This seems to assume that the codebase is stable and in maintenance mode. However, the reality for the vast majority of open-source projects is that they're either being actively developed, adding new code and new bugs, or they're abandoned.

I wouldn't put it as strongly--I think valid criticism can come from someone who hasn't "put something together"--but at the same time I think a lot of the criticism in this area isn't all that valid. Some of it is, but mostly it seems to me just people parroting "c is unsafe" as though that is itself the end of the discussion.
Exactly, show don't tell. And when I point to the fact that the Rust codebase itself has a lot of "unsafe" places I get the explanation "yes, but it's an old code and we haven't had the time to rewrite it to use the newer features that would allow more safe code."

Think about it: the people who design Rust don't have the time to rewrite their own Rust code to be safe.

When there's no time to rewrite the code of Rust to be demonstrably safe, why should any other project have more time than that, where the programmers know Rust much less or not at all?

Maintaining a successful project is hard and takes a lot of time, even without switching to something new.

Second point: look at the Benchmark game, a fast Rust code example also typically has an "unsafe" code. So it's still not even demonstrated that "you can have it both fast and safe" holds for these examples. Fast as in "as fast as C" and "safe" as in "provably safer than C."

Finally, I believe that the language should simply implement a language-level no-cost check for integer overflow, as the machine languages already have, and not complain that there aren't some special mechanisms in CPUs to do it differently. It's your language's failure if you can't use what already exists. The CPUs can already do it; the high-level languages don't provide high-level access to it. That doesn't matter for C, which is "unsafe" by definition (and where it's natural to call some asm when needed), but it should for Rust, whose goal is to produce more provably "safe" code.

Writing unsafe Rust instead of unsafe C is not giving you any real advantage for the existing projects, as long as the state is as it is.

Also, look how Rust uses TLS. Is safely rewriting a crypto library which is useful in real project still a too big task to be taken?

I like the perspectives of what Rust could bring. But I'd like to see a real life case of its success too.

> Writing unsafe Rust instead of unsafe C is not giving you any real advantage for the existing projects, as long as the state is as it is.

Your argument hinges pretty heavily on this. It's incorrect. The reality is that unsafe code in rust is not like unsafe code in C. First, it's bounded to a module. Second, you can grep for it (huge). Third, you still have less UB in unsafe rust than in C.
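To illustrate the "bounded to a module" and "you can grep for it" points, a minimal sketch. The module and function names are made up, and the unchecked read is only there to have an `unsafe` block to look at; a plain iterator version would probably be just as fast.

```rust
mod fast_sum {
    /// Safe public API: no input from the caller can lead to an out-of-bounds read.
    pub fn sum_first_n(data: &[u32], n: usize) -> u64 {
        // Clamp once, so the unchecked reads below are always in bounds
        let n = n.min(data.len());
        let mut total = 0u64;
        for i in 0..n {
            // SAFETY: i < n <= data.len()
            total += u64::from(unsafe { *data.get_unchecked(i) });
        }
        total
    }
}

fn main() {
    let v = [1u32, 2, 3, 4];
    assert_eq!(fast_sum::sum_first_n(&v, 2), 3);
    // Asking for more elements than exist is clamped, not an over-read
    assert_eq!(fast_sum::sum_first_n(&v, 100), 10);
}
```

An audit only has to check that the invariant in the `SAFETY` comment holds inside `fast_sum`; everything calling `sum_first_n` is ordinary safe code, and `grep unsafe` finds the one block that needs the extra scrutiny.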

I've never written code in rust that required unsafe, after many thousands of lines.

> Is safely rewriting a crypto library which is useful in real project still a too big task to be taken?

https://github.com/ctz/rustls

?????????????

> But I'd like to see a real life case of its success too.

Ripgrep is a nice one to point to. There's redox too. Tons of rust code out there.

> It's incorrect.

> I've never written code in rust that required unsafe, after many thousands of lines.

That illustrates the level of understanding of those who argue that curl should be rewritten in rust. Bravo.

Why, specifically?
It's self evident. One personal and very small example (many thousands is obviously less than 10 thousand) doesn't prove anything.

curl has on the order of 120,000 non-empty lines in .c and .h files.

The person who made the statement about "many thousands" probably wrote toy applications, compared to something like curl.

More realistic would be to compare something that should have been made in curl: like the "ring" library you mentioned as a Rust answer to "openssl." Which is neither an openssl nor written in Rust.

It is far from providing the functionality of openssl. The author states that:

"ring exposes a Rust API and is written in a hybrid of Rust, C, and assembly language."

"ring is focused on general-purpose cryptography. WebPKI X.509 certificate validation is done in the webpki project, which is built on top of ring. Also, multiple groups are working on implementations of cryptographic protocols like TLS, SSH, and DNSSEC on top of ring."

Actually a wrapper, and actually much smaller than openssl.

Second, "ring" has just some 9000 lines in rs files but 13000 lines of C (as it is defined by the author to be a wrapper, not surprising). So it is still just a wrapper around the C, still orders of magnitude smaller than curl, and still unable to be actually written in Rust.

On that level, I guess the Rust people would call curl a Rust application as soon as a few .rs files appeared in the curl code base. Or not.

It simply makes no sense to support Rust with misinformation instead of being honest about its current limitations regarding what's actually implemented and what actually doesn't use "unsafe."

It's obvious that those who suggest to others to rewrite the big projects in Rust should first show that they managed to make some much smaller safe code. Especially that they managed to make some meaningful full-Rust and safe libraries.

See, this is much better of a comment than the above, which was mostly an insult, with no substance.

(I don't agree with much of this but that's not the point; my point was trying to raise the level of discourse around here.)

> when I point to the fact that the Rust codebase itself has a lot of "unsafe" places

One of the criteria for being in the standard library is "is this something which needs a lot of unsafe." This is so it can receive extra auditing, etc.

> yes, but it's an old code

This may have been true a while ago, but in my understanding (I'm not on the library team) it isn't really any more.

> why should any other project have more time than that

Because they do not need to write their own unsafe code, since it's already done in the standard library.

Generally speaking, applications in Rust should almost never need unsafe. Libraries may.
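A small sketch of what that looks like in practice: handing out two mutable views into one buffer is something the borrow checker rejects if written by hand, so the standard library provides `split_at_mut`, which is implemented with unsafe code internally. The calling code stays entirely safe. `swap_halves` is a made-up example function.

```rust
/// Swap the first half of a slice with the second half, element by element.
/// Contains no `unsafe`; the only unsafe code involved lives inside std.
fn swap_halves(data: &mut [i32]) {
    let mid = data.len() / 2;
    // Two non-overlapping mutable slices into the same buffer
    let (left, right) = data.split_at_mut(mid);
    for (a, b) in left.iter_mut().zip(right.iter_mut()) {
        std::mem::swap(a, b);
    }
}

fn main() {
    let mut v = [1, 2, 3, 4];
    swap_halves(&mut v);
    assert_eq!(v, [3, 4, 1, 2]);
}
```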

> a fast Rust code example also typically has an "unsafe" code.

This is not really true anymore; in fact, recently some programs were contributed that are 100% safe yet are faster than their previous, unsafe-using versions.

> believe that the language should simply implement a language-level no-cost check for integer overflow

This is not viable. Not enough machines have this.

> Writing unsafe Rust instead of unsafe C is not giving you any real advantage for the existing projects, as long as the state is as it is.

Unsafe Rust still has many, many more safety checks than C.

> Also, look how Rust uses TLS. Is safely rewriting a crypto library which is useful in real project still a too big task to be taken?

https://crates.io/crates/ring ?

> This is not viable. Not enough machines have this.

Of course they have. It's a basic set of machine code instructions. Can you specify which doesn't please?

> One of the criteria for being in the standard library is "is this something which needs a lot of unsafe."

Do you honestly think a project like curl has everything it needs "which needs a lot of unsafe" already implemented in the current standard libraries?

Why don't you rewrite the curl in Rust then, it's just using what's already written? Let me know how far you come.

> https://crates.io/crates/ring ?

Does it have all the features that curl needs, or does curl still have to use a Rust wrapper around openssl, inheriting all the potential unsafety problems of openssl anyway? What is it worth then, in percentage terms, to have only the less exposed part of the code in Rust? How many other wrappers does curl have to use? Have you tried to check and estimate whether it's still worth it?

Once again, Rust people should first demonstrate that they can manage to make any relevant library "cleanly" safe, then complete the feature set, and only then expect big non-trivial programs to consider using some Rust.

First make the basics safe, then let us use those basics. Once you have a safe openssl equivalent, even C programs can link to it and be much less exposed to the potential problems than they are now. That is the only direction that makes sense.

> Of course they have. It's a basic set of machine code instructions. Can you specify which doesn't please?

That they exist is not the problem. That they exist _and_ don't provide enough performance is the issue. That is, your "no cost" is what I'm asserting here, not that it doesn't exist at all.

(This is not my area of specialty but was brought up when discussing this behavior; it's measured as a 20%-100% hit to speed, which is unacceptable.)

> Do you honestly think a project like curl has everything it needs "which needs a lot of unsafe" already implemented in the current standard libraries?

I don't think a project like curl needs much (if any) unsafe.

> Why don't you rewrite the curl in Rust then, it's just using what's already written? Let me know how far you come.

I did not suggest that curl should be re-written in Rust. I do think a functional equivalent in Rust would be good to have. I do not have the time to write such a thing myself. Luckily, many others have been working on this kind of thing.

> does curl still have to use a Rust wrapper around openssl,

ring is a port of BoringSSL to Rust, piecemeal. It still has some C code today, but at the end, will be all Rust/asm. Given that BoringSSL is a fork of OpenSSL, I would imagine that it would have enough functionality.

> That they exist is not the problem. That they exist _and_ don't provide enough performance is the issue. That is, your "no cost" is what I'm asserting here, not that it doesn't exist at all.

I'm claiming that the machine instructions exist to implement at no cost this high level construct (using C-like syntax):

    if ( overflows( c = a + b ) ) {
    }
What people say produces slowdowns is when they want to implement

     try {
          a huge bunch of instructions where some are additions
     }
     on integer_overflow_from_wherever do whatever
The former is no cost if implemented, and exists in every CPU. The latter doesn't exist, but who says that only the latter is what has to exist? Why not implement the no-cost former?

> It still has some C code today

Exactly my point why curl has no reason to be reimplemented in Rust now.

> I'm claiming that

This is different than the claims that were made when we made this decision. Specifically, the cost angle. You can absolutely do this, but we don't turn it on by default due to its expense. See https://danluu.com/integer-overflow/ for example. From that link:

> Summing up, integer overflow checks ought to cost a few percent on typical integer-heavy workloads

That few percent matters. And the extensive discussion (in both directions) on HN: https://news.ycombinator.com/item?id=8765714

It would be reasonable to turn it on, but that's not the decision we made.

> Exactly my point why curl has no reason to be reimplemented in Rust now.

Again, I never said curl should be ported to Rust.

FWIW, the former "no-cost" thing does exist in Rust, in the form of "checked_..." methods on the various integer types, e.g. https://doc.rust-lang.org/std/primitive.i32.html#method.chec... .
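A minimal sketch of those methods, written as the `if (overflows(c = a + b))` pattern from upthread. `add_or_report` is a made-up helper, not something from std; `checked_add` and `overflowing_add` are the real methods on `i32`. What this compiles down to depends on the target and the optimizer, so no claim about cost is made here.

```rust
/// Per-operation overflow check: Ok(sum) when it fits, Err(message) when it does not.
/// Hypothetical helper for illustration.
fn add_or_report(a: i32, b: i32) -> Result<i32, String> {
    // checked_add returns None on overflow instead of wrapping or panicking
    a.checked_add(b)
        .ok_or_else(|| format!("{} + {} overflows i32", a, b))
}

fn main() {
    assert_eq!(add_or_report(1, 2), Ok(3));
    assert!(add_or_report(i32::MAX, 1).is_err());

    // overflowing_add returns the wrapped result together with the overflow flag
    assert_eq!(i32::MAX.overflowing_add(1), (i32::MIN, true));
}
```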
> Second point: look at the Benchmark game, a fast Rust code example also typically has an "unsafe" code.

That's false. From[1], it looks like two submissions use `unsafe`. And one of those is for ffi to gmp.

[1] - http://benchmarksgame.alioth.debian.org/u64q/rust.html

OK. I only looked at the programs linked directly from: http://benchmarksgame.alioth.debian.org/u64q/rust.html
I think it's missing the point to count the programs that aren't the fastest, i.e. only one each of the spectral-norms and pi-digits are "interesting".
And apparently steveklabnik1 thinks it's missing the point to count other than "the latest version of each program".

https://www.reddit.com/r/programming/comments/62cx5d/addendu...

There are many ways to be selective with evidence.

As a matter of fact, all the contributed Rust programs I listed use unsafe.

> Benchmark game, a fast Rust code example also typically has an "unsafe" code

So you say but do not show.

Please show us which of those Rust programs you count as "fast".

Please show us how many of those programs have unsafe code.