

by jchw 2545 days ago

You seem pretty certain this will manifest as a noticeable performance hit. Can this kind of thing be easily measured in practice? I don't know if my workloads would be memory latency sensitive and worse, I don't even have a clue how I would find out.

I'm not too concerned though, since I already use Zen 1 and it looks to be around the same. If I had to take a shot in the dark, I'm guessing it's just a consequence of the chiplet designs of Zen and Zen 2 that enabled them to scale so well. If it is such a tradeoff and there are no future gains to be made on latency for Zen platforms, I can accept that.

Though it does make me curious if Intel will ever stumble upon the same problems.


garkin 2545 days ago

2400 vs 3000 vs 3200 MHz RAM | Ryzen 2nd gen: https://youtu.be/TjMq-Nv6Mq8

It affects general tasks too, but to a much lesser degree than gaming, because games care most about frame times and overall latency.


jchw 2545 days ago

Well, sure... but doesn't increasing the RAM clock increase the throughput as well?

What I'm asking is, is there a good way to test the effects of just memory latency?


loeg 2545 days ago

You can manually increase the DRAM access latency in BIOS, leaving clock alone, and measure. It impacts throughput somewhat, but may be a little closer to an apples-to-apples comparison. I don't know that anyone has attempted to do this for video games on Zen 2 specifically (especially considering Zen 2 is only commercially available for the first time, like, today).


garkin 2544 days ago

Here is a fresh Zen1 timings comparison from Reddit[1].

The largest increase is between 3200cl14 and 3200cl12: 12%. The difference between these two is almost purely latency.

Then compare 3200cl12 and 3600cl14: 3%, essentially no increase. There is almost no difference in latency between them, only in throughput and IF (Infinity Fabric) clock.

Past 3200, RAM throughput and inter-core communication (IF) have very little influence on Zen1 gaming. For Zen2 this scenario would differ in some ways, but not too much.
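
As a rough sanity check on the latency comparison, the CAS component can be worked out from the kit labels alone. This is a minimal sketch, assuming the common rule of thumb that first-word latency is CL cycles divided by the memory clock, where the clock is half the DDR transfer rate. The function name `cas_latency_ns` is made up for illustration, and this covers only CAS latency, not the full load-to-use latency a CPU sees or any other timings the Reddit poster may have changed.

```python
def cas_latency_ns(transfer_rate_mts, cl):
    """Approximate CAS latency in nanoseconds for a DDR kit.

    transfer_rate_mts: the marketing number, e.g. 3200 (mega-transfers/s).
    cl: CAS latency in memory clock cycles, e.g. 14.
    """
    # DDR transfers data twice per clock, so the real clock is half the rate.
    clock_mhz = transfer_rate_mts / 2
    # cycles / MHz gives microseconds; multiply by 1000 for nanoseconds.
    return cl / clock_mhz * 1000


if __name__ == "__main__":
    for rate, cl in [(3200, 14), (3200, 12), (3600, 14)]:
        print(f"{rate}cl{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

Under that assumption, 3200cl14 comes out at 8.75 ns, 3200cl12 at 7.50 ns, and 3600cl14 at about 7.78 ns, which matches the point above: the first pair differs mostly in latency, while the second pair differs mostly in clock.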

[1]: https://www.reddit.com/r/Amd/comments/c9x8v7/2700x_memory_sc...


vbezhenar 2545 days ago

I think a linked list with randomly distributed nodes is a good benchmark. Each jump will be a cache miss and cause a load from RAM, and you can't prefetch anything because the next address depends on the content of the current fetch. The performance of a simple iteration over that linked list should correspond to random memory access performance.
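
A minimal sketch of that idea, assuming the list is modelled as an index array whose entries form one random cycle, so every hop depends on the previous load. All names here (`build_random_cycle`, `chase`, `ns_per_hop`) are made up for illustration. Python is used only to show the structure: interpreter overhead is added to every hop, so the absolute numbers are inflated and only the change as the node count grows past the cache size is meaningful. A C version of the same loop would give figures closer to real DRAM latency.

```python
import random
import time


def build_random_cycle(n, seed=0):
    """Return nxt such that following i -> nxt[i] visits all n slots in one cycle."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)  # random visiting order
    nxt = [0] * n
    for pos in range(n):
        # Link each node to the one that follows it in the shuffled order.
        nxt[order[pos]] = order[(pos + 1) % n]
    return nxt


def chase(nxt, hops, start=0):
    """Follow the chain; the next index is only known after the current load."""
    i = start
    for _ in range(hops):
        i = nxt[i]
    return i


def ns_per_hop(n, hops=500_000):
    """Average wall-clock time per hop, in nanoseconds, for an n-node cycle."""
    nxt = build_random_cycle(n)
    t0 = time.perf_counter()
    chase(nxt, hops)
    return (time.perf_counter() - t0) * 1e9 / hops


if __name__ == "__main__":
    # Sweep from a cache-sized working set to one that should spill to RAM.
    for n in (1 << 10, 1 << 16, 1 << 21):
        print(f"{n:>8} nodes: {ns_per_hop(n):6.1f} ns/hop")
```

Building a single cycle rather than picking random successors matters: independent random successors tend to fall into short loops that fit in cache, which would hide the latency being measured.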


viraptor 2545 days ago

> I don't know if my workloads would be memory latency sensitive and worse, I don't even have a clue how I would find out.

If you develop on something Unix-ish, valgrind's cachegrind will tell you about your L1 performance. On recent Linux you can get this straight from the kernel with `perf stat` https://perf.wiki.kernel.org/index.php/ (cache-misses are total misses in all levels).

The most basic question is: are you randomly accessing more than your processor's cache worth of memory?
