Hacker News
by aimanbenbaha 184 days ago
DeepSeek V3.2 is that cheap because its attention mechanism is ridiculously efficient.
1 comment
esafak 184 days ago
Yeah, DeepSeek Sparse Attention. Section 2: https://arxiv.org/abs/2512.02556
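For a rough feel of why sparse attention is cheaper, here is a minimal sketch of generic top-k sparse attention for a single query. It is not the paper's implementation. As I understand it, the general idea is that a cheap scoring pass ranks the past tokens and full attention then runs only over the top few. The function names, the `index_scores` input, and `k` are all hypothetical and only there for illustration.

```python
import math


def _softmax_mix(scores, values):
    """Softmax-weight the value vectors by their scores and sum them."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / z for d in range(dim)]


def dense_attention(q, keys, values):
    """Standard attention: the query is scored against every key."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, key)) for key in keys]
    return _softmax_mix(scores, values)


def topk_sparse_attention(q, keys, values, index_scores, k):
    """Sparse attention: the query is scored against only k selected keys.

    index_scores is assumed to come from some cheaper scoring pass
    (hypothetical here); only the k highest-scoring positions are kept.
    """
    k = min(k, len(keys))
    keep = sorted(range(len(keys)), key=lambda i: index_scores[i], reverse=True)[:k]
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, keys[i])) for i in keep]
    return _softmax_mix(scores, [values[i] for i in keep])
```

With `k` fixed, the expensive part of each query touches `k` keys instead of the whole context, which is where the savings on long contexts would come from. Setting `k` to the full context length gives back dense attention.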