by graycat 3865 days ago
Good thoughts. Thanks.

Two questions:

(1) IIRC, some operating systems, on seeing ECC errors (maybe only the uncorrectable ones, maybe the correctable ones too), would mark the associated memory, or block of memory, as faulty, maybe stop the application program using that memory, and continue running. Is this done by current operating systems?

(2) What would Windows Server do with a thread, process, address space or whatever the heck that encountered a memory error detected by ECC, especially one that was uncorrectable?

I'm eager to know since I'm eager to build a server, with ECC memory, and run Windows Server in production.

1 comment

(1) I have never heard of this behavior. Correctable errors will be reported via MCA (Machine Check Architecture) on Intel, and uncorrectable ones will reset the system (and probably be logged in some firmware log).

(2) As far as I know, the normal consequence of a detected multi-bit error is a system reset.
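
To see how many corrected and uncorrected errors have been reported on a given machine, the counters can be read directly. The following is only a minimal sketch and it assumes a Linux host with an EDAC driver loaded, which exposes per-memory-controller `ce_count` and `ue_count` files under `/sys/devices/system/edac/mc`. It does not apply to Windows Server, and the function name `read_ecc_counts` is made up for illustration.

```python
from pathlib import Path

# Assumption: Linux with an EDAC driver loaded. This path does not exist
# on Windows or on Linux systems without EDAC support.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_ecc_counts(root=EDAC_ROOT):
    """Return {controller_name: (corrected, uncorrected)} per memory controller.

    Hypothetical helper for illustration; returns an empty dict when the
    EDAC directory is missing.
    """
    counts = {}
    root = Path(root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are unreadable or malformed.
            continue
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_ecc_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing on one controller is usually the hint to go looking for a failing DIMM before an uncorrectable error shows up.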