


	by _ink_ 125 days ago
	Interesting. Is it because they can, or is it really more expensive for them to process a bigger context?

2 comments

cube2222 125 days ago

Attention is, at its core, quadratic wrt context length. So I'd believe that to be the case, yeah.
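
To make the quadratic claim concrete, here is a rough back-of-the-envelope sketch. It counts only the multiply-adds in the two n-by-n steps of plain (unoptimized) self-attention for a single head when a whole prompt is processed at once. The function name and the `d_head=128` default are made up for illustration, and it deliberately ignores the parts of a model whose cost grows linearly with context (projections, MLP layers) as well as any provider-side optimizations, so treat it as an illustration of the scaling, not a cost model.

```python
def attention_multiply_adds(n_tokens: int, d_head: int = 128) -> int:
    """Approximate multiply-adds for one attention head over n_tokens.

    Hypothetical helper for illustration; d_head=128 is an assumed head size.
    """
    # Q @ K^T: n x n dot products, each of length d_head
    scores = n_tokens * n_tokens * d_head
    # softmax(scores) @ V: n x n weights applied to n x d_head values
    weighted_sum = n_tokens * n_tokens * d_head
    return scores + weighted_sum


if __name__ == "__main__":
    base = attention_multiply_adds(1_000)
    for n in (1_000, 2_000, 4_000, 8_000):
        cost = attention_multiply_adds(n)
        # Doubling the context length quadruples this term
        print(f"n={n:>5}  multiply-adds={cost:>15,}  ratio vs n=1000: {cost // base}x")
```

Each doubling of the context length multiplies this term by four, which is the sense in which attention is quadratic in context length.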


pkaye 125 days ago

I've read that compute costs for LLMs grow as O(n^2) with context window size. But I think it is also a combination of limited compute availability, users' preference for Anthropic models, and Anthropic planning to IPO.
