Hacker News
New DeepSeek paper: Natively Trainable Sparse Attention mechanism (twitter.com)
5 points by redlock 482 days ago
1 comment
eunos 481 days ago
Authored and uploaded by none other than Liang Wenfeng himself.