| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by vlovich123 854 days ago
	Yup exactly. The only way independent hardware can help is if the fault is state dependent in a way on the hardware (eg differences in behavior due to thermal load or different internal state corruption or something) in which case repeated computations may not help if the repeated computation is not sufficiently decoupled temporally to get rid of that state. The other thing with independent hardware is that you don’t pay a 3x performance penalty (instead 3x cost penalty). That being said, none of these fault modes are what are really what is being discussed in the paper. The other one that freaks me out is miscompilation by the compiler and JITs in the data path of an application. Like we’re using these machines to process hundreds of millions of transactions and trillions of dollars - how much are these silent mistakes costing us?

1 comments

paganel 854 days ago

I think that strictly looking at it in terms of money-related operations stuff can still be managed/double-checked externally, i.e. by the real world, which means that whatever mistakes/inconsistencies might show up there's still a "hard reality" out there that will start screaming "hey! this money figure is not correct!" because people tend to notice when there are big money-discrepancies and the "mistakes" are, generally speaking, reversible when it comes to money.

What's worrying is when systems like these get used in real-time life-and-death situations, and there's basically no reversibility because that would imply dead people returning to life. For example the code used for stuff like outer space exploration, sure that right now we can add lots and lots of redundancies and check-ups in the software being used in that domain because the money is there to be spent and we still don't have that many people out there in space. But what will happen when we'll think of hosting hundreds, even thousands of people inside a big orbital station? How will we be able to make sure that all the safety-related code for that very big structure (certainly much bigger than we have now in space) doesn't cause the whole thing to go kaboom based on an unknown-unknown software error?

And leaving aside scenarios that are not there yet, right now we've started using software more and more when it comes to warfare (for example for battle simulations based on which real-life decisions are taken), what will happen to the lives of soldiers whose conduct in war has been lead by faulty software?

link

vlovich123 854 days ago

The financial impact was just to highlight a scope in terms that can result in a single calculable easy to understand number. And also most transactions are automated and rarely validated manually so I’m not sure how many inconsistencies we’re catching. Look at the UK post office scandal and that was basic distributed systems bugs used in the auditing software where the system was granted privilege over manual review (sure there’s lots wrong with that scandal but it is illustrative of how much deference we give to automated systems since that tends to be the right tradeoff to make).

link

jorticka 854 days ago

The recent Ukraine war shows that soldiers lifes are cheap - according to commanders.

So many soldiers on both sides died because of really dumb commander decisions, missing kit, political needs, that worrying about CPU errors is truly way way down the list.

link

paganel 854 days ago

At the tactical level, of course that what you're saying is true, but the big Ukrainian counter-offensive from last year had been preceded by lots and lots of allusions made to "war games simulations" set-up by Ukraine's allies in the West (mostly the US and the UK), and it is my understanding that those war games were heavily taken into consideration as a basis for that counter-offensive decision. I'm not saying that the code behind those simulations was faulty, I'm just saying that software is already used at an operational level (at least) when it comes to war.

As per the sources, here's this one in The Economist [1] from September 2023, just as it had become obvious that the counter-offensive had fizzled out:

> American and British officials worked closely with Ukraine in the months before it launched its counter-offensive in June. They gave intelligence and advice, conducted detailed war games to simulate how different attacks might play out

And another one from earlier on [2], in July 2023, when things were stil undecided:

> Ukraine’s allies had spent months conducting wargames and simulations to predict how an assault might unfold.

[1] https://archive.is/1u7OK

[2] https://archive.is/NyGJI

link

jorticka 854 days ago

It was mostly an old-school kind of wargames.

There was an article about a giant real 3D map in a room with all the commanders there discussing what each one has to do, contingencies, ...

The reason for the counter-offensive failure were multiple and complex, good or bad software would not have changed the result significantly.

link