Hacker News
by wskwon 1099 days ago
Thanks for the explanation! I believe the two ideas are basically orthogonal: FlashAttention reduces memory reads and writes, while PagedAttention reduces memory waste.
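To make the "memory waste" half a bit more concrete, here is a rough toy sketch. It assumes the baseline reserves a fixed maximum length per sequence up front, while a paged scheme hands out fixed-size blocks on demand. The function names, `max_len`, `block_size`, and the sequence lengths are all made up for illustration; this is not vLLM's actual allocator, and it says nothing about the read/write side that FlashAttention targets.

```python
import math

def contiguous_waste(seq_lens, max_len):
    """Unused KV-cache slots when every sequence reserves max_len slots up front.

    Hypothetical baseline: assumes no sequence exceeds max_len.
    """
    return sum(max_len - n for n in seq_lens)

def paged_waste(seq_lens, block_size):
    """Unused slots when each sequence holds only the blocks it needs.

    Only the tail of a sequence's last block can be unused.
    """
    return sum(math.ceil(n / block_size) * block_size - n for n in seq_lens)

# Illustrative token counts for three in-flight sequences (made-up numbers).
seq_lens = [5, 130, 700]

print("contiguous:", contiguous_waste(seq_lens, max_len=2048))
print("paged:     ", paged_waste(seq_lens, block_size=16))
```

With paging, the waste per sequence is bounded by one block minus one slot, regardless of how long the sequence could have grown.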