Hacker News
by ProofHouse 236 days ago
Wasn't this the attention sink concept, to some degree? I mean, it doesn't seem out of the realm of possibility that, if the latency overhead isn't significant, frontier models start adopting something similar to the DeepSeek OCR tech.