by Zenst 2077 days ago
I like what IBM has done with their latest Power chip: they have effectively made the whole memory interface upgradable.

https://www.nextplatform.com/2020/09/03/the-memory-area-netw...

"the shift from dedicated DDR4 memory controllers to Serdes-based, high speed differential signaling mixed with buffer chips on memory modules that can be taught to speak DDR4, DDR5, GDDR6, 3D XPoint, or whatever, is an important shift in system design and one that we think, ultimately, the entire industry will get behind eventually."

3 comments

I think this makes sense for IBM, but probably not for AMD and Intel. AMD and Intel are putting out new chips every year (even if they're just Skylake refreshes), so it's not too big of a deal to change the memory interface, and they are capable of putting in two flavors of DDR support when some flexibility is needed. IBM, I think, doesn't release incremental chip designs each year, so it makes more sense for them to accept the tradeoff of a flexible interface so they can get on board with newer RAM faster.
This is pretty much today's greed-driven approach to technology. Companies don't want to invest in proper FPGA technology that would allow chipsets to be reconfigured on the fly. Instead they want people to buy new chipsets every year or so, and preferably to make the old ones obsolete (not resellable).
Computers have never used FPGA memory controllers that are upgradeable to newer DRAM standards so there's really no reason to say that's the "proper" solution. And memory standards overlap for 4-5 years which also happens to be the lifetime of a PC so the price/performance benefit of future-proofing really isn't there.
This kind of proves my point. It doesn't lead to maximisation of profits therefore it is not being developed.
How would the increased cost of a FPGA memory controller benefit the end user?

They would still need a new motherboard to use newer memory types, because the modules will have different connectors. They might need a new chipset if the newer memory type wasn't actually addressable with the FPGA (not enough pins, not enough signalling capacity, not enough voltage flexibility, etc).

For the majority of users, they don't change the cpu, motherboard, or memory for the life of the computer (in many cases, some or all of these parts are soldered to the board). Paying more for flexibility that will never be used isn't good for anyone.

I would agree that it supports your point. Proves it? no.
AMD and Intel have released multiple DDR4 chipsets over the last ~4 years. The chipset releases have hardly been related to RAM type support. Typically they are power or PCIe related, with Intel being the worst offender.
RAM support is on the CPU die. The chipset doesn't connect the CPU to RAM anymore and hasn't for a very long time.
Did I say it did?

Ryzen has seen improvements to DDR4 module support with each generation and to take advantage of the latest generation you may need to upgrade your motherboard to one with the latest chipset, depending on vendor support. Even if your older board supports the newer generation, likely you have a more limited QVL for DDR4 modules, so compatibility and perf may be limited.

This is why I implied it had limited impact, though there is some truth to boards with newer AMD chipsets offering better memory support.

This was a response to parent comment stating it was greedy behavior to try to get consumers to buy new motherboards with new chipsets, when the chipsets have little impact on RAM compatibility/support.

Adds a latency hop, which generally can be dealt with by prefetching and larger caches on the CPU side.

There's speculation that AMD is going to do the same. In Zen 2 and later designs the CPU chiplets are coupled with different IO dies depending on the design (Ryzen, Threadripper, Epyc), and swapping out the IO die for one that supports new or different memory types would be less work than taping out a whole new monolithic CPU.

> CPU chiplets are coupled with different IO dies depending on the design

How does that differ from Intel's approach?

Intel uses monolithic dies, meaning everything is on the same chunk of silicon. This has given them some improvements to latency[1] and power usage[2] in the past but hurt them on yields. AMD has a chiplet design, which improves yields[3] and may allow a more modular approach as mentioned.

[1] This can be overcome via caching and other design considerations, but taken purely on its own, this aspect does affect latency.

[2] Longer traces lead to higher capacitance, and the power estimation formula P = C * V^2 * f * a (capacitance, voltage squared, frequency, activity factor) shows that this one aspect will change power use. Everything on one die means less parasitic capacitance.

[3] If the defect density is the same, and you have 10 defects per wafer, then you will see different yields depending on whether you make 10, 100, or 1000 chips on that wafer. Chiplets are smaller than monolithic designs, so more of them fit on one wafer, which improves yield independent of process.
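To put rough numbers on footnotes [2] and [3], here is a minimal Python sketch. It is not from the article or from any vendor data. The yield part assumes a simple Poisson defect model (defects land at random, any die with at least one defect is scrapped, and the dies tile the wafer evenly), which is my assumption and only one of several common yield models. The function names and the example capacitance, voltage, and activity values are made up for illustration.

```python
import math


def dynamic_power(c_farads, v_volts, f_hz, activity):
    """Dynamic power estimate P = C * V^2 * f * a, as in footnote [2]."""
    return c_farads * v_volts ** 2 * f_hz * activity


def poisson_yield(defects_per_wafer, dies_per_wafer):
    """Fraction of good dies under an assumed Poisson defect model.

    The expected number of defects per die is defects / dies, and a die
    is good only if it received zero defects.
    """
    return math.exp(-defects_per_wafer / dies_per_wafer)


# Footnote [3]: same 10 defects per wafer, different die counts.
for dies in (10, 100, 1000):
    print(f"{dies:5d} dies per wafer -> {poisson_yield(10, dies):.1%} yield")
# prints 36.8%, 90.5%, and 99.0% respectively

# Footnote [2]: with everything else fixed, power scales linearly with C.
# The values below are illustrative placeholders, not measurements.
base = dynamic_power(1e-9, 1.0, 5e9, 0.1)
longer_traces = dynamic_power(2e-9, 1.0, 5e9, 0.1)
print(f"power ratio after doubling capacitance: {longer_traces / base:.1f}")
```

Under this model the smaller die wins simply because each defect ruins a smaller fraction of the wafer, which is the point footnote [3] is making.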

Intel desktop CPUs are monolithic. A 10900K is a single die, while an AMD 3900X is three dies: two compute dies and one I/O die (which are sourced from two manufacturers on two different processes, afaik). An AMD server CPU has the same compute dies, except more of them, and a very different IO die.

The AMD compute dies are only connected to the IO die and other compute dies (and power). All IO connections go exclusively through the IO die, so the IO die can be customized to change the IO of the CPU without changing anything about the compute dies. It would be entirely feasible to just re-spin the IO die to add support for different memory, Thunderbolt, or other IO ports. The IO die is also made on a cheaper, lower-density, lower-performance process (14 nm / 16 nm) than the compute dies (7 nm).

So AMD is back to having a northbridge again but it's on the same package instead of the motherboard for latency reasons? Or could we actually get away with a northbridge on the motherboard again?
Sort of. It allows them to build lots of different sorts of systems (many CPU chiplets and a memory controller, one CPU chiplet and a memory controller, and so on), all from the same basic components. Intel has to spin a new chip for each SKU, while AMD can build different SKUs by packaging the parts differently. That gives them a lot more flexibility and means they can spin out new products to fit a new market segment much more quickly.
I know that motivation but I'm more curious about the hardware architecture angle. Integrating the memory controller in the CPU was supposedly a big gain at the time. Now it's in a different chip and multi-socket motherboards already have to traverse the board to access RAM attached to another chip. Are the interconnects better now and so going back to a single northbridge is workable? Would it simplify the topology in multi-socket systems to have all the RAM together instead of having to take care with process affinity to RAM? I'd love a source for discussion around these kinds of tradeoffs.
Why does it add a latency hop?
At 5 GHz, signals travel only about 4 cm each cycle (signal speed in copper is around 60% of the speed of light).

Moving stuff further away means waiting multiple cycles for a reply.

Random article:

> Power10 chip .. using the DDR4 buffer chip from MicroChip and only adding a mere 10 nanoseconds to memory latency.

At 5 GHz that would be 50 cycles, or enough time for a signal to make a round trip to a point roughly 1 m away. The delay is in processing rather than in wire length: bridging logical and technical protocols, resynchronization, fanout, probably even a cache layer.
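The back-of-the-envelope numbers in this subthread can be checked with a short Python sketch. The 60% velocity factor, the 5 GHz clock, and the 10 ns figure are taken from the comments above; the function names are mine, and the model treats the whole delay as pure wire propagation, which is exactly the assumption the comment argues does not hold in practice.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
VELOCITY_FACTOR = 0.6  # "around 60% lightspeed", per the comment above


def distance_per_cycle_m(clock_hz, velocity_factor=VELOCITY_FACTOR):
    """Distance a signal covers during one clock cycle, in metres."""
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S / clock_hz


def cycles_for_delay(delay_s, clock_hz):
    """Number of clock cycles that elapse during a given delay."""
    return delay_s * clock_hz


def one_way_reach_m(delay_s, velocity_factor=VELOCITY_FACTOR):
    """How far away a device could be if the delay were only a
    there-and-back propagation delay with no processing at all."""
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S * delay_s / 2


clock_hz = 5e9      # 5 GHz
added_delay = 10e-9  # the 10 ns quoted for the buffer chip

print(f"per cycle: {distance_per_cycle_m(clock_hz) * 100:.1f} cm")
print(f"cycles in 10 ns: {cycles_for_delay(added_delay, clock_hz):.0f}")
print(f"one-way reach in 10 ns: {one_way_reach_m(added_delay):.2f} m")
```

This gives about 3.6 cm per cycle, 50 cycles, and a reach of about 0.9 m each way, consistent with the "4 cm" and "1 m" estimates. Since a buffer chip on a DIMM is nowhere near a metre from the CPU, most of the 10 ns has to come from something other than distance.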

Sounds a lot like FB-DIMMs.