| Completed a full rewrite of many components of the Kraken.com backend in about 4 years. The new system is around 1.5M LOC of Rust. There was no serious alternative to rewriting; sometimes you find yourself in a corner and need to fix issues and pay the price. I wrote about it 3 years ago here https://blog.kraken.com/product/engineering/oxidizing-kraken... Everything in that blog post still rings true, and in hindsight we were right. But it was a massive grind and required extreme dedication to get it done; for a variety of reasons that work was very taxing. We also didn't stop feature development and kept the two systems running concurrently (which explains why it took so long; growing and training a new team 10x the size also took time, so there are many factors). I'm also against rewrites if I can help it, but reality is complex and sometimes we can't help it. Now, however, since we removed the last pieces of legacy that were preventing larger DB schema changes (or required massive, unreasonable changes to the legacy systems), we've been shipping faster and easier than ever and have caught up on a lot of the accumulated backlog, including some of the more ambitious projects that were unthinkable in the legacy systems due to limitations. |
Looking back, is there anything that you would have done differently? I find that half or more of the rewrites that I have dealt with were driven by all the wrong motivations. You get inevitable turnover, and at some point people dislike code that they didn't write themselves and push for a rewrite, maybe changing the stack to something trendy and justifying it with thin arguments. Once the rewrite starts, the company ends up treading water for years while incurring a ton of costs. In my 15 years in tech, I think only one rewrite that I was part of was a good decision. If I could go back in time, I think I would kill all rewrite discussions the moment someone first whispers the idea.
How did you guys enjoy switching to Rust? I assume the safety and performance benefits for the trading system are a huge plus (didn't Kraken trading go down for an entire week a few years ago?). Did you rewrite the webapp backend in Rust as well? How have staffing and budgeting been affected? I would assume that the supply of Rust developers is much lower unless you train them in house. Rust sounds fun, but I can't imagine trying to justify a rewrite of a legacy system, a major tech stack change, and training/building a new team all at the same time.
Sorry for the onslaught of questions. The "rewrite it in Rust" fever has spread to my work and I'm fighting with myself over how to respond.