| HN Mirror

by throwaway77385 119 days ago

> spinning disks have been replaced by NVMe solid state drives with near-RAM I/O bandwidth

Am I missing something here? Even Optane is an order of magnitude slower than RAM.

Yes, under ideal conditions, SSDs can have very fast linear reads, but IOPS / latency have barely improved in recent years. And that's what really makes a difference.

Of course, compared to spinning disks, they are much faster, but the comparison to RAM seems wrong.

In fact, for applications like AI, even using system RAM is often considered too slow, simply because of the distance to the GPU, so VRAM needs to be used. That's how latency-sensitive some applications have become.

2 comments

fluoridation 118 days ago

>for applications like AI, even using system RAM is often considered too slow, simply because of the distance to the GPU

That's not why. It's because RAM has a narrower bus than VRAM. If it were a matter of distance, it would just have greater latency, and you'd still have tons of bandwidth to play with.

dist-epoch 118 days ago

You could be charitable and say the bus is narrow because it has to travel a long distance and this makes it hard to have a lot of traces.

fluoridation 118 days ago

It's not. It's narrow even between the CPU and RAM. That's just the way x86 is designed. Nvidia and AMD by contrast have the luxury of being able to rearchitect their single-board computers each generation as long as they honor the PCIe interface.

It is also true that having a 384-bit memory bus shared with the video card would necessitate a redesigned PCIe slot as well as an outrageous number of traces on the motherboard, though.

adrian_b 118 days ago

Traditionally, the width of the GPU memory interfaces was many times greater than that of CPUs.

However, the maximum width in consumer GPUs, up to 1024-bit, was reached many years ago.

Since then the width of the memory interfaces in consumer GPUs has been decreasing continuously, and this decrease has been only partially compensated by higher memory clock frequencies. This reduction has been driven by NVIDIA, in order to increase their profit margins by reducing the memory cost.

Nowadays, most GPU owners must be content with a memory interface no better than 192-bit, as in the RTX 5070, which is only 50% wider than that of a desktop CPU and much narrower than that of a workstation or server CPU.

Using main memory from the GPU is slow not because of the width of the CPU memory interface, but because the GPU accesses main memory through PCIe. It is therefore limited by the throughput of at most 16 PCIe lanes, which is much lower than that of either the GPU memory interface or the CPU memory interface.
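
To put rough numbers on this, here is a back-of-the-envelope sketch. The specific rates are assumptions chosen for illustration, not figures from the comment: a PCIe 5.0 link at 32 GT/s per lane with 128b/130b encoding, DDR5-5600 on a 128-bit desktop interface, and GDDR7 at 28 Gbit/s per pin on a 192-bit GPU interface. The helper names are hypothetical, and the results are theoretical peaks that ignore protocol overhead.

```python
# Rough peak-bandwidth comparison. All rates are illustrative assumptions.

def pcie_gb_per_s(lanes: int, gt_per_s: float, encoding: float = 128 / 130) -> float:
    """Peak one-direction PCIe bandwidth in GB/s (line encoding only)."""
    return lanes * gt_per_s * encoding / 8


def memory_gb_per_s(bus_bits: int, gbit_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: bus width times per-pin data rate."""
    return bus_bits * gbit_per_pin / 8


pcie5_x16 = pcie_gb_per_s(16, 32.0)     # assumed PCIe 5.0 x16
cpu_ddr5 = memory_gb_per_s(128, 5.6)    # assumed DDR5-5600, 128-bit
gpu_gddr7 = memory_gb_per_s(192, 28.0)  # assumed GDDR7 at 28 Gbit/s, 192-bit

for name, value in [("PCIe 5.0 x16", pcie5_x16),
                    ("CPU DDR5, 128-bit", cpu_ddr5),
                    ("GPU GDDR7, 192-bit", gpu_gddr7)]:
    print(f"{name:20s} {value:7.1f} GB/s")
```

Under these assumptions the PCIe link peaks at roughly 63 GB/s per direction, below the roughly 90 GB/s of the CPU memory interface and about a tenth of the 672 GB/s of the GPU memory interface.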

dist-epoch 118 days ago

Threadripper has 8 memory channels versus 2 for a desktop AMD CPU. It's not an x86 limitation.

fluoridation 118 days ago

"x86" as in the computer architecture, not the ISA. Why do you think they put extra channels instead of just having a single 512-bit bus?

adrian_b 118 days ago

The memory interface of CPUs is made wider by adding more channels because there are no memory modules with a 512-bit interface. Thus you must add multiples of the module width to the CPU memory interface.

This has nothing to do with x86, but it is determined by the JEDEC standards for DRAM packages and DRAM modules. The ARM server CPUs use the same number of memory channels, because they must use the same memory modules.

A standard DDR5 memory module has a memory interface that is 64, 72, or 80 bits wide, depending on how many extra bits are available for ECC. The interface of a module is partitioned into 2 channels, to allow concurrent accesses at different memory addresses. Although current memory channels are therefore 32, 36, or 40 bits wide, few people are aware of this, so by "memory channel" most people mean 64 bits (or 72 bits with ECC), because that was the width of a memory channel in older memory generations.

Not counting ECC bits, most desktop and laptop CPUs have a 128-bit memory interface, some cheaper server and workstation CPUs have a 256-bit memory interface, many server CPUs and some workstation CPUs have a 512-bit memory interface, while the state-of-the-art server CPUs have a 768-bit memory interface.
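
The widths above follow directly from the module width. As a small sketch (the function names are hypothetical, and "channel" here means the conventional 64-bit channel, i.e. one module's worth of data bits):

```python
# Interface widths derived from DDR5 module widths, as described above.

DATA_BITS_PER_MODULE = 64  # data width of one DDR5 module, not counting ECC


def interface_width_bits(conventional_channels: int) -> int:
    """Total CPU memory interface width for a number of 64-bit channels."""
    return conventional_channels * DATA_BITS_PER_MODULE


def subchannel_width_bits(module_bits: int) -> int:
    """Each DDR5 module interface is split into 2 independent channels."""
    return module_bits // 2


for channels in (2, 4, 8, 12):
    print(f"{channels:2d} channels -> {interface_width_bits(channels)}-bit interface")

for module_bits in (64, 72, 80):
    print(f"{module_bits}-bit module -> 2 x {subchannel_width_bits(module_bits)}-bit channels")
```

So 2, 4, 8, and 12 conventional channels give the 128-bit, 256-bit, 512-bit, and 768-bit interfaces mentioned above.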

For comparison, the RTX 5070 has a 192-bit memory interface, the RTX 5080 has a 256-bit memory interface, and the RTX 5090 has a 512-bit memory interface. However, GDDR7 memory has a transfer rate that is 4 to 5 times higher than that of DDR5, which makes the GPU interfaces faster, despite their similar or even lower widths.
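
A quick sketch of why the transfer rate matters more than the width here. The per-pin rates are assumptions for illustration (DDR5-6400 at 6.4 Gbit/s per pin and GDDR7 at 28 Gbit/s per pin), and the table pairs them with the widths listed above. Peak bandwidth is simply width times per-pin rate.

```python
# Peak bandwidth = bus width (bits) * per-pin rate (Gbit/s) / 8.
# Per-pin rates are assumed values, not taken from the thread.

DDR5_GBIT_PER_PIN = 6.4    # assumed DDR5-6400
GDDR7_GBIT_PER_PIN = 28.0  # assumed GDDR7 speed grade

interfaces = {
    "desktop CPU (128-bit DDR5)": (128, DDR5_GBIT_PER_PIN),
    "server CPU (768-bit DDR5)": (768, DDR5_GBIT_PER_PIN),
    "RTX 5070 (192-bit GDDR7)": (192, GDDR7_GBIT_PER_PIN),
    "RTX 5090 (512-bit GDDR7)": (512, GDDR7_GBIT_PER_PIN),
}

bandwidth_gb_s = {
    name: bits * rate / 8 for name, (bits, rate) in interfaces.items()
}

rate_ratio = GDDR7_GBIT_PER_PIN / DDR5_GBIT_PER_PIN

for name, gb_s in bandwidth_gb_s.items():
    print(f"{name:28s} {gb_s:8.1f} GB/s")
print(f"GDDR7 / DDR5 per-pin rate ratio: {rate_ratio:.2f}")
```

With these assumed rates, the 192-bit GPU interface still comes out ahead of the 768-bit server CPU interface, because the per-pin rate ratio falls in the 4 to 5 times range.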

throwaway77385 118 days ago

I can't edit my comment, but to the people responding here, thank you for adding all this information. It really helped elucidate why VRAM vs RAM is a distinction and also prevents my somewhat naive interpretation from being the only thing people see. Thanks!
