by wskwon 1099 days ago
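Thanks for the explanation! I believe the two ideas are basically orthogonal. FlashAttention reduces memory reads/writes (traffic between GPU HBM and on-chip SRAM during the attention computation), while PagedAttention reduces memory waste (KV cache space that is reserved but never used).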
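To make the "memory waste" half concrete, here is a toy sketch, not vLLM's actual allocator. It assumes a baseline that reserves a contiguous `max_len` slots of KV cache per sequence up front, versus handing out fixed-size blocks on demand. The function names, the sequence lengths, and the values 2048 and 16 are all made up for illustration.

```python
import math

def contiguous_waste(seq_lens, max_len):
    """Unused KV slots when every sequence reserves max_len slots up front (assumed baseline)."""
    return sum(max_len - n for n in seq_lens)

def paged_waste(seq_lens, block_size):
    """Unused KV slots when slots are allocated in fixed-size blocks as the sequence grows."""
    return sum(math.ceil(n / block_size) * block_size - n for n in seq_lens)

# Hypothetical sequence lengths (in tokens) for three concurrent requests.
seq_lens = [100, 700, 1500]

print(contiguous_waste(seq_lens, max_len=2048))  # 3844 slots reserved but unused
print(paged_waste(seq_lens, block_size=16))      # 20 slots, only the tail of each last block
```

Neither function touches how attention itself is computed, which is why a kernel that cuts reads/writes can be combined with an allocator that cuts waste.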