HN Mirror



	by nixon_why69 31 days ago
	Why not have a bunch of SRAM and various operations like "Q4 matmul" in silicon? Model weights and even architectures could still evolve on a platform like that.

2 comments

ac29 31 days ago

Doesn't "a bunch of SRAM" top out at maybe a few gigs per chip (with zero area used for logic)? You'd need an order of magnitude more to fit even a fairly weak general-purpose LLM.
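
A rough back-of-envelope sketch of that arithmetic, in Python. The parameter counts and the 2 GiB SRAM figure are illustrative assumptions rather than numbers for any real chip, and it counts weights only, ignoring quantization scales, KV cache, and activations:

```python
GIB = 1024 ** 3

# Hypothetical stand-in for "a few gigs" of on-chip SRAM (an assumption).
ASSUMED_SRAM_BYTES = 2 * GIB


def weight_bytes(n_params: int, bits_per_weight: int = 4) -> int:
    """Bytes needed to store the weights alone at the given precision."""
    return n_params * bits_per_weight // 8


# Illustrative model sizes (assumptions, not tied to specific models).
for name, n_params in [("7B", 7_000_000_000), ("70B", 70_000_000_000)]:
    size = weight_bytes(n_params)
    ratio = size / ASSUMED_SRAM_BYTES
    print(f"{name}: {size / GIB:.1f} GiB at 4 bits/weight, "
          f"{ratio:.1f}x the assumed SRAM")
```

Real Q4 formats store per-block scales as well, so actual footprints would be somewhat larger than this estimate.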


throwa356262 31 days ago

I believe that is what NPUs are.

The issue is the huge amount of DRAM and the high memory bandwidth these models require.
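
To put a rough number on the bandwidth side, here is a minimal Python sketch. It assumes a dense model decoding one token at a time, where every weight is read from memory once per token, so memory bandwidth caps the token rate. The 35 GB model size and 100 GB/s bandwidth are made-up illustrative values:

```python
def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if each token requires reading all weights once.

    Ignores compute time, caching, and batching (simplifying assumptions).
    """
    return bandwidth_bytes_per_s / model_bytes


# Hypothetical numbers: a 35 GB quantized model on 100 GB/s of memory bandwidth.
bound = max_tokens_per_second(model_bytes=35e9, bandwidth_bytes_per_s=100e9)
print(f"at most about {bound:.1f} tokens/s")
```

Under those assumptions, capacity alone is not enough: the memory also has to be fast enough to stream the whole model for every generated token.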
