HN Mirror



	by Nav_Panel 236 days ago
	Love it, they're teaching LLMs how to skim texts properly, which is exactly the right approach for handling long contexts.

1 comment

ProofHouse 236 days ago

Wasn't this the attention sink concept, to some degree? I mean, it doesn't seem out of the realm of possibility that, if the latency overhead isn't significant, frontier models start adopting tech similar to DeepSeek OCR.
