Hacker News
by dahart 3461 days ago
I can imagine bunches of possibilities. Paging can make things hard to predict, especially when multiple programs are allocating memory, but it doesn't make the system non-deterministic, nor does it make hitting the same physical address impossible.

One possibility is that he didn't restart the program between retries, and the memory in question was already allocated. Another possibility is that he only ran Handbrake and nothing else, and the OS was in more or less the same state both times. It could also be that the problem was triggered by stack allocations rather than heap allocations: the video block in question caused a large-ish recursion that hit the bad memory, and it would be likely to hit it no matter what else was running, since it's somewhat rare to have large stack allocations.

Chances are it was actually none of those things, but they're real possibilities anyway.

3 comments

Maybe my Handbrake installation was broken because of defective RAM - I don't know exactly... anyway: I found the problem was RAM and now it works...
It's actually scary how much (unpredictable and maybe undetectable) stuff can happen due to bad RAM.
I once had a bad RAM socket. I sent back RAM that failed memtest86 and was rather confused when the next set failed in the same way.
Yeah! Bad bit baked into the executable is a strong possibility.
Presuming "frozen PC" means "unresponsive and must be forcibly rebooted", there would be no retrying with the program without restarting.

What with Windows Update and the variety of other similar OS- and application-level auto-updaters, is getting the computer into a very similar state likely? I'm not sure but my gut says no.

That said, at first I was imagining a desktop computer with 4 or 8 memory modules, but given a machine with just 2 modules, maybe it follows that one module usually gets filled with "core stuff" and the second, defective module somewhat infrequently sees "big user stuff" after the first module is filled, and I guess that isn't too much of a head-scratcher when it comes to identifying the source of the problem.
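
To make that two-module guess concrete, here is a toy sketch. Everything in it is an assumption for illustration: the sizes, the bad address, and above all the idea that memory fills linearly from the first module to the second (real systems often interleave modules and don't allocate physical memory in order).

```python
# Toy model only: assumes physical memory is filled linearly from address 0,
# first by "core stuff" and then by the user's allocation. Real memory
# controllers and OS allocators generally do not behave this simply.

MODULE_SIZE = 1024                # hypothetical size of each module (arbitrary units)
BAD_ADDRESS = MODULE_SIZE + 100   # hypothetical defective cell, inside the second module


def touches_bad_cell(core_usage, user_allocation):
    """Return True if the user's allocation covers the bad cell.

    The allocation is assumed to occupy the addresses
    [core_usage, core_usage + user_allocation).
    """
    start = core_usage
    end = core_usage + user_allocation  # exclusive upper bound
    return start <= BAD_ADDRESS < end


core = 600                                 # made-up "core stuff" footprint
small_job = touches_bad_cell(core, 200)    # stays inside the first module
big_job = touches_bad_cell(core, 700)      # spills into the second module
print(small_job, big_job)
```

In this model only the big job ever reaches the defective cell, which would match "works most of the time, fails on the heavy task".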

> but it doesn't make the system non-deterministic

Actually, it does. It's possible to calculate WCET involving RAM accesses, as the behaviour is deterministic; there's a set latency.

It's not possible whenever swap is involved, which is why most of the realtime world simply avoids swap. This is mentioned in the Genode handbook, if you're willing to dig into it more.
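
As a rough sketch of the timing argument (the latency figure and function name are made up for illustration, not taken from any real system): with a fixed worst-case RAM latency the bound is simple arithmetic, while an access that may go to swap has no guaranteed upper bound in this simplified model.

```python
import math

RAM_LATENCY_NS = 100  # assumed worst-case latency of one RAM access (made-up figure)


def wcet_ns(compute_ns, memory_accesses, may_swap):
    """Return a worst-case execution time bound in nanoseconds.

    Simplified model: every memory access costs at most RAM_LATENCY_NS.
    If any access may hit swap, no finite bound is assumed, so the
    function returns infinity.
    """
    if may_swap:
        return math.inf
    return compute_ns + memory_accesses * RAM_LATENCY_NS


bounded = wcet_ns(5000, 200, may_swap=False)
unbounded = wcet_ns(5000, 200, may_swap=True)
print(bounded, unbounded)
```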

I guess I need to clarify I was talking about read/write location determinism, and not timing. Timing could affect things, but it also might not. The question is only whether the same bit was touched, not whether the entire system state is identical in every aspect.

I assumed that swap wasn't involved, and I was even going to mention it but decided against it. While it is remotely possible, there's not much reason to suspect swap: Handbrake was actively running, it doesn't normally use enough memory to start swapping, and people ripping DVDs usually know not to be doing other things and/or using all their memory while ripping.

That said, are you saying swap in the OS really is non-deterministic by design, or just hard to predict? And what does Genode have to do with this, assuming he wasn't using Genode?