by projektfu 8 days ago
"At 1M tokens, SubQ 1.1 Small requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2."
6450% less compute? Is Trump working there?
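
For what it's worth, a cut of more than 100% can't happen, so "64.5x less" most plausibly means the compute is divided by 64.5. A quick sketch of that arithmetic, assuming "Nx less" means baseline / N (my reading, not something the quoted sentence spells out; `reduction_percent` is just a name made up for the example):

```python
def reduction_percent(factor: float) -> float:
    """Percent reduction implied by 'factor x less', read as baseline / factor.

    Assumption: 'Nx less' means the new value is 1/N of the baseline.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


# Factors taken from the quoted claim.
compute_cut = reduction_percent(64.5)
time_cut = reduction_percent(56.0)

print(f"64.5x less compute is about {compute_cut:.2f}% less")  # about 98.45%
print(f"56x faster is about {time_cut:.2f}% less run time")    # about 98.21%
```

Under that reading the claim is roughly 98.4% less compute, which is large but at least arithmetically possible.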