


	by mrob 109 days ago
	LLM inference is mostly read-only, so high-bandwidth flash looks like it could provide huge cost savings over VRAM. It's not yet in commercial products, but there are already working prototypes. Previous HN discussion: https://news.ycombinator.com/item?id=46700384

1 comment

Are you saying that Intel’s Optane product was just ahead of its time? Is Optane the answer to LLMs’ ever-increasing appetite?