HN Mirror



	by bob1029 389 days ago
	I think it is too risky to build a company around the premise that no one will soon solve the quadratic scaling issue, especially when that company involves creating ASICs. E.g.: https://arxiv.org/abs/2312.00752

2 comments

qeternity 389 days ago

Attention is not the primary inference bottleneck. For each token you have to load all of the weights (or activated weights) from memory. This is why Cerebras is fast: they have huge memory bandwidth.
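
To put rough numbers on that argument, here is a back-of-the-envelope sketch. It assumes single-stream decoding where every weight is read from memory once per generated token, and it ignores KV-cache traffic, batching, and compute time. The function name, the 70B parameter count, the 2-byte weight size, and the bandwidth figures are all illustrative assumptions, not measurements of any real chip.

```python
def max_decode_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on decode speed if all weights are read once per token."""
    weight_bytes = param_count * bytes_per_param
    return bandwidth_bytes_per_s / weight_bytes


# Hypothetical dense model: 70e9 parameters stored as 16-bit values (2 bytes each).
params = 70e9
bytes_per_param = 2

# Illustrative memory bandwidths only; not taken from any vendor's spec sheet.
for label, bandwidth in [("1 TB/s", 1e12), ("10 TB/s", 10e12)]:
    bound = max_decode_tokens_per_second(params, bytes_per_param, bandwidth)
    print(label, "->", round(bound, 1), "tokens/s at most")
```

Under these assumptions the ceiling goes from roughly 7 to roughly 71 tokens per second as bandwidth grows tenfold, with no change to the attention mechanism at all. For a mixture-of-experts model you would plug in the activated parameter count instead of the total.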


Havoc 389 days ago

Yeah, this also strikes me as quite risky. Their gear seems very focused on the Llama family specifically.

It just takes one breakthrough and it's all different. See the recent diffusion-style LLMs, for example.
