by brucedawson 4204 days ago
A hundred cycles to zero a page? If 4 KB can be written in a hundred cycles then, assuming a 3 GHz processor, that's 30 million pages per second or ~120 GB/s. That's pretty fast memory. On x86/x64 processors, which lack a zero-the-cacheline instruction, the memory will also end up being read, so you need 240 GB/s to clear pages that quickly. This ignores the cost of TLB misses, which require further memory accesses.
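For anyone who wants to check the arithmetic, here is a rough sketch in Python. The 3 GHz clock, 100 cycles per page and 4 KB page size come from the comment above. The function name and the simple "double the traffic when each line is read before it is written" model are my own assumptions for illustration, not measurements of any real CPU.

```python
# Back-of-the-envelope check of the bandwidth figures in the comment.
# Assumed inputs (from the comment): 3 GHz clock, 100 cycles per page, 4 KB pages.
CLOCK_HZ = 3e9
CYCLES_PER_PAGE = 100
PAGE_BYTES = 4096


def zeroing_bandwidth(clock_hz, cycles_per_page, page_bytes, reads_before_write):
    """Return (pages per second, bytes per second of memory traffic).

    reads_before_write models a CPU with no zero-the-cacheline instruction:
    each line is assumed to be read once and written once, so traffic doubles.
    """
    pages_per_second = clock_hz / cycles_per_page
    write_bytes_per_second = pages_per_second * page_bytes
    multiplier = 2 if reads_before_write else 1
    return pages_per_second, write_bytes_per_second * multiplier


pages, write_only = zeroing_bandwidth(CLOCK_HZ, CYCLES_PER_PAGE, PAGE_BYTES, False)
_, with_reads = zeroing_bandwidth(CLOCK_HZ, CYCLES_PER_PAGE, PAGE_BYTES, True)

print(f"pages/s:        {pages / 1e6:.0f} million")
print(f"write only:     {write_only / 1e9:.1f} GB/s")
print(f"read and write: {with_reads / 1e9:.1f} GB/s")
```

With 4096-byte pages the write-only figure comes out slightly above the rounded ~120 GB/s, and the read-plus-write figure slightly above 240 GB/s. TLB misses are not modeled.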

The zeros don't need to get pushed to memory immediately. They go to cache, where they will typically be overwritten with your real data long before they are pushed out to memory. That push of your real data would have needed to happen anyway, so there is (usually) minimal extra cost associated with the zeroing.

There are, of course, pathological cases where you touch one byte on a new page, and then don't write anything else before the whole thing gets pushed out, but they are relatively rare in performance-critical contexts.