Hacker News
by kevinnk 3291 days ago
Manufacturing logic on the same wafer as DRAM is difficult since the processes are so different. On the other hand, manufacturing on separate dies and connecting with an interposer or TSVs gets you tremendous bandwidth and relatively low latency; this is how some newer generation graphics memory is implemented (see https://en.m.wikipedia.org/wiki/High_Bandwidth_Memory).
2 comments

CPU logic on DRAM process is apparently a solved problem: http://venraytechnology.com/Implementations.htm
You can make logic on a DRAM process, but either your logic will be slow or you'll have to increase the cost or power consumption of your DRAM cells. With separate chips connected with TSVs you can have your cake and eat it too (each process can be specialized for its components), plus you get the added benefit of increased yields (chips are smaller, no special processing steps).
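To put a rough number on the yield point, here is a minimal sketch using the textbook Poisson yield model, Y = exp(-A * D0). The die areas and defect density are made-up illustrative values, not figures from any real process, and the model assumes dies are tested before assembly and ignores assembly/TSV yield loss.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under the simple Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)

def silicon_per_good_system(die_areas_cm2, defects_per_cm2):
    """Average silicon area consumed per working system, assuming each die
    is tested on its own and only known-good dies are assembled."""
    return sum(a / poisson_yield(a, defects_per_cm2) for a in die_areas_cm2)

D0 = 0.5  # defects per cm^2 (illustrative assumption)

monolithic = silicon_per_good_system([2.0], D0)    # one 2 cm^2 die
split = silicon_per_good_system([1.0, 1.0], D0)    # two 1 cm^2 dies

print(f"monolithic: {monolithic:.2f} cm^2 per good system")
print(f"split:      {split:.2f} cm^2 per good system")
```

Under these assumptions the split design burns less silicon per working system, because a defect only kills the smaller die it lands on.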
That's kind of the entire point of Venray's claims: the CPUs are both plentiful and fast, with far lower power consumption than discrete CPU + RAM combos of equivalent computational strength. They apparently only add a couple percent to the die size of the DRAMs.

However, their business model presumes "Wait until a DRAM manufacturer buys us", which IMO is why nothing's moved forward. DRAM manufacture is low-margin and not really the place to look for this kind of risky introduction to the market. I'd love to see this form of parallelism, and their take on breaking the memory bandwidth wall; it meshes great with the types of problems I work on.

The point of my post was that there's not much upside that integrated DRAM/logic has that TSVs don't, but plenty of downsides. Regardless of Venray's claims, there's a reason modern high-performance parallel architectures go with TSVs and interposers (Knights Landing, new GPUs, some deep learning platforms, etc.) instead of logic in DRAM.
Part of that is simply silicon design inertia, though.

Do you see interposer style designs as linking up terabytes of DRAM? (at least in the near future) All the chips you're talking about are pretty major dies, not really suitable for having many stacks of them in conventionally tightly spaced DIMM arrays to reach such RAM sizes.

Of course, 3d chip advances might throw all current assumptions out the window and change the layout of everything.

> Do you see interposer style designs as linking up terabytes of DRAM?

I don't think we're going to see a terabyte of DRAM on an interposer for a while (4GB is about the max you can get commercially right now). I'm not sure what you're trying to get at, though; even with logic in DRAM you have to go off chip to get to terabyte levels, so I don't see the advantage.
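Rough arithmetic on that gap, as a sketch: taking the ~4GB-per-package figure above at face value (the constant and function names are made up for illustration), a terabyte means going off package a few hundred times over.

```python
GIB = 2**30
PER_PACKAGE_BYTES = 4 * GIB   # assumption: the ~4GB per interposer package quoted above
TARGET_BYTES = 1024 * GIB     # 1 TiB

def packages_needed(target_bytes, per_package_bytes):
    """Integer ceiling division: packages required to reach the target capacity."""
    return -(-target_bytes // per_package_bytes)

print(packages_needed(TARGET_BYTES, PER_PACKAGE_BYTES))  # 256
```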

> All the chips you're talking about are pretty major dies, not really suitable for having many stacks of them in conventionally tightly spaced DIMM arrays to reach such RAM sizes.

The stacking happens in package (<1mm thick). Your DIMM array is going to have to be pretty damn tight for that to matter.

> Of course, 3d chip advances might throw all current assumptions out the window and change the layout of everything.

TSVs are 3D (or "2.5" depending on the configuration). You should have thrown out the assumptions back in 2014.

But logic and SRAM are more compatible. I'm still curious because multiple processors having random access to a large memory is not easy. The scratchpad they mention sounds like a cache, but maybe it's something more than that.
'Scratchpad' implies it is directly addressable; I guess you could call it a software-controlled cache.
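To make the distinction concrete, a toy sketch (not Venray's design; the `Scratchpad` class and `sum_in_tiles` are hypothetical names for illustration). With a cache, hardware decides what is resident. With a scratchpad, the program addresses the local memory directly and issues every copy itself:

```python
class Scratchpad:
    """Small local memory with its own addresses; nothing moves unless software says so."""

    def __init__(self, size):
        self.cells = [0] * size

    def _check(self, dram, dram_addr, local_addr, count):
        if local_addr + count > len(self.cells) or dram_addr + count > len(dram):
            raise IndexError("transfer out of range")

    def load_from(self, dram, dram_addr, local_addr, count):
        # Explicit, software-issued copy (a DMA-like transfer in real hardware).
        self._check(dram, dram_addr, local_addr, count)
        self.cells[local_addr:local_addr + count] = dram[dram_addr:dram_addr + count]

    def store_to(self, dram, dram_addr, local_addr, count):
        # Explicit write-back; there is no automatic eviction as in a cache.
        self._check(dram, dram_addr, local_addr, count)
        dram[dram_addr:dram_addr + count] = self.cells[local_addr:local_addr + count]


def sum_in_tiles(dram, pad):
    """Sum a large array by staging one scratchpad-sized tile at a time."""
    total = 0
    tile = len(pad.cells)
    for base in range(0, len(dram), tile):
        count = min(tile, len(dram) - base)
        pad.load_from(dram, base, 0, count)  # the program chooses what is resident
        total += sum(pad.cells[:count])
    return total


dram = list(range(10))
pad = Scratchpad(4)
print(sum_in_tiles(dram, pad))  # 45
```

The tiling loop is the "software-controlled" part: placement and replacement are decisions in the program, not in a hardware replacement policy.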