|
|
|
|
|
by SomeoneFromCA
851 days ago
|
|
Interesting paper, but has some technical errors. First of all, they keep mentioning SRAM+ECC, instead of DRAM+ECC; you cannot use gcj to inspect assembly code generated for Java method, as it will be completely different from the code generated by Hotspot; you do not need all that acrobatics to get disasm of the method, you could just add an infinite loop to the code and attach gdb to the JVM process and inspect the code or dump the core. |
|
That's not a technical error, they mean SRAM in the CPU itself. You're right about gcj but that's kind of a moot point when investigating some reproducible CPU bug like this. The paper mentions all the acrobatics they went through when trying to find the root cause but if gcj would have been practical then it also would have been immediately clear if the gcj output reproduced the error or not. If it didn't reproduce, no big deal, try another approach. You might be right about it being easier to root cause with gdb directly but I'm not so sure. Starting out, you have no idea which instructions under what state are triggering the issue so you'd be looking for a needle in a haystack. A crashdump or gdb doesn't let you bisect that haystack so good luck finding your needle.