by _urga
2025 days ago
The advertised bandwidth for RAM is not what you get per core, which is what matters in practice.

If you want to know the upper bound on your per-core RAM bandwidth:

64 bytes (the size of a cache line) * 10 slots (in a CPU core's LFB, or line fill buffer) / 100 ns (the typical cost of a cache miss) * 1,000,000 * 1,000 (to convert ns to ms to seconds) = 6,400,000,000 bytes per second = 5.96 GiB per second of RAM bandwidth per core

There's no escaping that upper bound per core.

Nanosecond RAM latencies don't help much when you're capped by the line fill buffer and queuing delay kicks in, spiking your cache miss latencies. You can only have 10 line fetches in flight at a time per core, and when you exceed your 5.96 GiB per second budget your access times increase.

If you compare with NVMe SSD throughput plus Direct I/O plus io_uring, around 32 GiB per second, and divide that by 10 according to the difference in access latencies, then I think the author is about right on target. The point they are making is valid: it's the same order of magnitude.
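For anyone who wants to plug in their own numbers, here is a rough sketch of the same arithmetic in Python. The function name and defaults are mine; the 10 LFB slots and 100 ns miss cost are the ballpark figures from above, not measurements of any particular CPU.

```python
def per_core_bandwidth(line_bytes=64, lfb_slots=10, miss_latency_ns=100):
    """Upper bound on per-core RAM bandwidth, in bytes per second.

    Assumes every line fill buffer slot stays busy and each cache miss
    costs exactly miss_latency_ns (both are simplifying assumptions).
    """
    ns_per_second = 1_000_000_000
    return line_bytes * lfb_slots * ns_per_second / miss_latency_ns


bound = per_core_bandwidth()
print(f"{bound:,.0f} bytes/s = {bound / 2**30:.2f} GiB/s per core")

# The NVMe comparison from above: ~32 GiB/s divided by 10
# for the difference in access latencies (a rough figure, not a benchmark).
nvme_scaled_gib = 32 / 10
print(f"NVMe scaled: {nvme_scaled_gib:.1f} GiB/s vs RAM per core: {bound / 2**30:.2f} GiB/s")
```

Raising `miss_latency_ns` to model queuing delay shows how quickly the bound drops once you are over budget.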
|
I've since bought all the pieces but the CPU; they are all sold out. So I got a 6-core 3600XT in the interim. I bought fairly high-binned RAM and overclocked it to 3600 MHz, and was surprised to cap out at about 36 GB/s throughput. Your 6 GiB/s per core explanation checks out for me!
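A quick sanity check of that, as a sketch only: it assumes all 6 cores hit the per-core bound at once and that nothing else (memory channels, fabric) saturates first, which may not hold on real hardware.

```python
# Per-core bound from the formula above: 64 B * 10 slots / 100 ns
PER_CORE_BYTES_PER_S = 64 * 10 * 1_000_000_000 // 100  # 6,400,000,000
CORES = 6  # the 6-core part mentioned above

# Naive aggregate: assumes perfectly linear scaling across cores
aggregate = PER_CORE_BYTES_PER_S * CORES

print(f"{aggregate / 1e9:.1f} GB/s = {aggregate / 2**30:.2f} GiB/s across {CORES} cores")
```

That comes out in the same ballpark as the roughly 36 GB/s I measured.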