| HN Mirror



	by PeterStuer 363 days ago
	Given that a non-quantized 700B monolithic model with, let's say, a 1M-token context would need around 20TB of memory, I doubt your Spark or M4 will get very far. I'm not saying those machines can't be useful or fun, but they're not in the range of the 'fantasy' thing you're responding to.
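
For anyone who wants to sanity-check the order of magnitude, here is a rough back-of-envelope sketch. It is not the commenter's calculation: the layer count, KV head count and head dimension below are invented for illustration (no real 700B model is being described), and it only counts weights plus KV cache, ignoring activations and runtime overhead.

```python
# Rough memory estimate: model weights + KV cache for a long context.
# All architecture numbers below are hypothetical, chosen only for illustration.

def estimate_memory_bytes(n_params, n_layers, n_kv_heads, head_dim,
                          context_tokens, bytes_per_value):
    """Return (weight_bytes, kv_cache_bytes) for a dense transformer."""
    weight_bytes = n_params * bytes_per_value
    # Each token stores one K and one V vector per layer per KV head.
    kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    kv_cache_bytes = kv_bytes_per_token * context_tokens
    return weight_bytes, kv_cache_bytes

TB = 10**12  # decimal terabyte

# Assumed (made-up) shape: 128 layers, 128 KV heads of dim 128, no grouped-query attention.
for bytes_per_value in (2, 4):  # 16-bit vs 32-bit values
    weights, kv = estimate_memory_bytes(
        n_params=700_000_000_000,
        n_layers=128,
        n_kv_heads=128,
        head_dim=128,
        context_tokens=1_000_000,
        bytes_per_value=bytes_per_value,
    )
    print(f"{bytes_per_value} bytes/value: weights {weights / TB:.1f} TB, "
          f"KV cache {kv / TB:.1f} TB, total {(weights + kv) / TB:.1f} TB")
```

With these assumed dimensions the total comes out at roughly 10 TB with 16-bit values and roughly 20 TB with 32-bit values, so the figure is at least plausible as an order of magnitude. Grouped-query attention or a quantized KV cache would shrink the cache term a lot, which is why the result depends so heavily on the assumptions.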

1 comments

daft_pink 363 days ago

I regularly use Gemini CLI and Claude Code, and I'm convinced that Gemini's enormous context window isn't that helpful in many situations. I think the more you put into context, the more likely the model is to go off on a tangent, and you end up with "context rot", or it gets confused and starts working from older, no-longer-relevant context. You definitely need to manage and clear your context window, and the only time I would want such a large context window is when the source data is really that large.

PeterStuer 362 days ago

Context quality and relevance are indeed major factors. But large size is not the core issue, although when the context is unmaintained or poorly relevant, a smaller window is going to blissfully forget the bad, and the good, sooner.