by _bxg1 2563 days ago
Programmers, especially naive ones, tend to think asymptotically. We focus on what would be the most theoretically ideal and write off petty concerns like "time" and "effort" as implementation details. I'm guilty of this myself.

Rust is this unusual situation where a new language is in many ways a strict improvement over a very widely-used preexisting language. That is extremely rare. So I think that's where the religious fervor comes from.

Would the world be a better place if Tor and the Linux Kernel and every C++ project out there had originally been written in Rust (setting aside the fact that this was obviously impossible)? Sure, maybe. But those projects also benefit from decades of refinement and bug fixes. Rewriting them means losing all of that. As an idealistic programmer it's tempting to write factors like legacy refinement off into the margins, but they're actually very significant. A piece of software's value doesn't solely reside in its preconceived design.

> those projects also benefit from decades of refinement and bug fixes. Rewriting them means losing all of that.

Not my experience at all, having spent a significant chunk of my career doing rewrites for scalability. Having the existing codebase makes it very easy to benefit from the accumulated knowledge, while at the same time doing a pass through that can catch a lot of outright errors. If you can rewrite a project incrementally while keeping it working (and for C->Rust, you can), then the arguments against rewriting your code don't apply, and I'd actually consider full rewrites a useful exercise even if you weren't rewriting to a better language.

Maybe. But even with the old code base for reference, it's easy to not see that those weird lines right there are written the way they are to handle two different corner cases (and of course there are no helpful comments to explain it). This is especially true if you're rewriting from $MORE_OPAQUE_LANGUAGE to $CLEARER_LANGUAGE. There can be a lot of not obvious things going on in the opaque stuff, and you can drop things you need in the rewrite unless you are very careful.

As carlmr points out, tests can help a lot... if you've got tests for those corner cases. They can be new tests, created as part of the rewrite - if anyone remembers that they are needed, or if anyone can see the corner cases either in the problem space or in the code. (This argues for doing the rewrite while the original authors are still around, preferably with them as part of the rewrite team.)
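For what it's worth, one way to pin those corner cases down before touching anything is a characterization test: run the old and the new implementation side by side and check that they agree. A minimal sketch in Rust, assuming the old logic can be called from the test at all; `legacy_parse_port`, `rewritten_parse_port`, and both corner cases (empty input, port 0) are made up for illustration:

```rust
/// Stand-in for the old code: the "weird lines" quietly handle two corner cases.
fn legacy_parse_port(input: &str) -> Option<u16> {
    let trimmed = input.trim();
    // Corner case 1 (hypothetical): an empty field means "use the default port".
    if trimmed.is_empty() {
        return Some(8080);
    }
    let port: u16 = trimmed.parse().ok()?;
    // Corner case 2 (hypothetical): port 0 is rejected rather than passed through.
    if port == 0 {
        return None;
    }
    Some(port)
}

/// Stand-in for the rewrite; it has to reproduce both corner cases to agree.
fn rewritten_parse_port(input: &str) -> Option<u16> {
    match input.trim() {
        "" => Some(8080),
        s => s.parse::<u16>().ok().filter(|&p| p != 0),
    }
}

/// Returns every input on which the two implementations disagree.
fn find_mismatches<'a>(inputs: &[&'a str]) -> Vec<&'a str> {
    inputs
        .iter()
        .copied()
        .filter(|&s| legacy_parse_port(s) != rewritten_parse_port(s))
        .collect()
}
```

Drop either corner case from the rewrite and the offending input shows up in the list returned by `find_mismatches`. It only helps for inputs someone thought to include, which is the whole problem.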

Right. Put differently: any sufficiently long-lived codebase contains institutional knowledge. Knowledge that goes deeper than just the explicit interface boundaries that you're planning to re-implement. That doesn't mean rewrites are a bad idea, just a disruptive one. They'll introduce new fragility for a while before they start being a net gain. Code has to be broken-in, like a new pair of shoes.

> There can be a lot of not obvious things going on in the opaque stuff, and you can drop things you need in the rewrite unless you are very careful.

If the knowledge is so opaque as to be nonobvious from the code in front of you, then it's already an accident waiting to happen: someone making normal maintenance changes to that code (not a rewrite) would be equally likely to break those properties.

I agree. I do rewrites all the time (from language to language, and from one architecture to another). Rewrites are a net plus for me.

What matters is that $new_lang or $new_arch MUST cut down problems and/or code in a big way.

To give a minimal example, Rust eliminates NULLs (except when interfacing with $old). Also, pattern matching makes a lot of problems go away, FOREVER.
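A rough sketch of what that looks like in practice. The `User` type and the function names are invented for illustration; the point is only that "might be missing" lives in the type as `Option`, and `match` will not compile if a case is left out:

```rust
#[derive(Debug)]
struct User {
    name: String,
    email: Option<String>, // absence is part of the type, not a null pointer
}

/// Lookup that may fail: the caller gets Option<&User> and has to handle None.
fn find_user<'a>(users: &'a [User], name: &str) -> Option<&'a User> {
    users.iter().find(|u| u.name == name)
}

/// Pattern matching makes every case explicit; a missing arm is a compile error.
fn contact_line(users: &[User], name: &str) -> String {
    match find_user(users, name) {
        None => format!("{name}: no such user"),
        Some(User { email: None, .. }) => format!("{name}: no email on file"),
        Some(User { email: Some(addr), .. }) => format!("{name}: {addr}"),
    }
}
```

Delete any one of the three arms and the compiler rejects the `match` as non-exhaustive, instead of the bug surfacing at runtime.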

This means it's like having another developer who helps cut work and increases quality at the same time.

This is not limited to compilers. For example, I work on business apps and use RDBMSs a lot. Use an RDBMS well? It cuts work and increases quality.

Rewriting queries and schemas, and using indexes, views and ANY trick the RDBMS allows (the idea of "not using the DB to the full because of 'data independence'" is poison if not VERY well justified), helps in spades.

So:

- Naive rewriting = VERY BAD

- Move from $old to $new WITHOUT the help of an "improved" architecture, feature, tooling or concept = WASTED

- Move from $old to $new with something better? TOTAL WIN.

P.S.: I think moving between languages that are too similar will probably be a waste. You get much more bang for your buck when moving from imperative to functional than from imperative to imperative. Not because functional is superior, but because it forces you to use power (immutable and algebraic types, for example) that previously depended on discipline...
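To make the "power instead of discipline" point concrete, here is a small sketch using an algebraic type and by-value updates. The `Order` type and `pay` function are hypothetical, not from any real codebase:

```rust
/// Hypothetical order state: each variant carries only the data valid for it,
/// so "shipped but no tracking number" cannot even be constructed.
#[derive(Debug, Clone, PartialEq)]
enum Order {
    Draft { items: Vec<String> },
    Paid { items: Vec<String>, receipt_id: u64 },
    Shipped { tracking: String },
}

/// Takes the order by value and returns a new one; nothing is mutated in place.
/// Invalid transitions come back as an Err instead of relying on a remembered `if`.
fn pay(order: Order, receipt_id: u64) -> Result<Order, String> {
    match order {
        Order::Draft { items } if !items.is_empty() => Ok(Order::Paid { items, receipt_id }),
        Order::Draft { .. } => Err("cannot pay for an empty order".to_string()),
        other => Err(format!("cannot pay for an order in state {other:?}")),
    }
}
```

In an imperative codebase the same rules usually live in nullable fields and status flags that everyone has to remember to check; here the type system does the remembering.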

My experience as well. The key point here is the incremental part. This works especially well if you're adding regression tests incrementally as well. This necessitates a certain modularity & testability in the legacy project though, which often isn't the case. There's a certain threshold in technical debt (depending on the size of the project), at which point a complete rewrite may be better.

For most other projects, Working Effectively with Legacy Code by Michael Feathers describes a good approach for slowly modularizing and testing the system before exchanging parts.

Incidentally, Tor has some Rust in it these days. I haven't heard from them for a while, so I don't know how the project is going...

Tor is probably one of the better cases because of its stringent security requirements, but there remains an oft-ignored tradeoff of losing all that refinement work (and the fact that you can't just flip a switch to make it happen). It's a cost/benefit decision.

Totally; that's why it's not a wholesale re-write, it's slowly replacing old code bit by bit, or writing new code.

Which is the only real way to migrate any codebase of significant size to a new language.