by scott00 2531 days ago
A rollback without understanding is definitely risky. An uninformed rollback was one of the factors that killed Knight Capital Group in 2012. For those not familiar: the underlying problem was that they failed to update one server in a cluster of eight, and the server still on the old version was making bad trades. They tried to mitigate with a rollback, which made all eight servers start making bad trades. In the end they lost $460 million over the course of about 45 minutes.

The full report is here if you're curious: https://www.sec.gov/litigation/admin/2013/34-70694.pdf

1 comment

Knight Capital also didn't know what version of software their servers were running, didn't know which servers were originating the bad requests, had abandoned code still in the codebase, and reused flags that controlled that abandoned code (another summary: https://sweetness.hmmz.org/2013-10-22-how-to-lose-172222-a-s...). I'm not sure what you can infer about the risk of a rollback in a less crazy environment.
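
To make the "know what version your servers are running" point concrete, here is a rough sketch of a pre-rollback check that refuses to proceed while the fleet is inconsistent. Everything in it is hypothetical: the host names, the version labels, and the functions `find_version_outliers` and `safe_to_roll_back` are made up for illustration, and how you actually collect the reported versions is left out. It is not a description of Knight's systems.

```python
from collections import Counter


def find_version_outliers(reported_versions):
    """Return the hosts whose version differs from the most common one.

    reported_versions maps host name -> version string (hypothetical input;
    gathering it from real servers is outside this sketch).
    """
    if not reported_versions:
        return {}
    majority, _count = Counter(reported_versions.values()).most_common(1)[0]
    return {host: ver for host, ver in reported_versions.items() if ver != majority}


def safe_to_roll_back(reported_versions):
    """Only allow a fleet-wide rollback when every host reports the same version."""
    return not find_version_outliers(reported_versions)


# Hypothetical fleet: seven servers updated, one left behind.
fleet = {f"server-{i}": "new" for i in range(1, 8)}
fleet["server-8"] = "old"

outliers = find_version_outliers(fleet)
if not safe_to_roll_back(fleet):
    print("Fleet is inconsistent, investigate before rolling back:", outliers)
```

The point of the check is only to surface the odd server out before anyone changes the other seven. It says nothing about which version is the bad one, so someone still has to understand the failure first.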