Hacker News
by scv119, 1098 days ago
I believe you could adapt the FlashAttention kernel with only slight changes to implement the same kernel as this paged attention, since both of them operate on the key/value cache at the block level.
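
To make the idea concrete, here is a rough pure-Python sketch, not the real FlashAttention or paged attention code. It assumes a single query vector, a sequence length that is a multiple of the block size, and toy data. Every name in it (`attend_blockwise`, `fetch_block`, `block_table`, `pool`) is made up for illustration. The block-by-block loop with a running softmax stays the same; the only thing that changes is how block `j` is located in memory.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend_reference(q, keys, values):
    """Plain softmax attention for one query over the full K/V cache."""
    scores = [dot(q, k) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(len(values[0]))]

def attend_blockwise(q, fetch_block, num_blocks):
    """Attention computed one K/V block at a time with a running softmax.

    fetch_block(j) is a hypothetical callback returning (keys, values)
    for logical block j; the loop does not care where the block lives.
    """
    running_max, denom, acc = -math.inf, 0.0, None
    for j in range(num_blocks):
        k_blk, v_blk = fetch_block(j)
        scores = [dot(q, k) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the previous running max.
        rescale = math.exp(running_max - new_max)
        if acc is None:
            acc = [0.0] * len(v_blk[0])
        denom *= rescale
        acc = [a * rescale for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / denom for a in acc]

# Toy data: 6 tokens, head dimension 2, block size 2 (all assumed values).
BLOCK = 2
q = [0.3, -0.7]
keys = [[0.1 * i, 0.2 - 0.05 * i] for i in range(6)]
values = [[float(i), 1.0 - 0.5 * i] for i in range(6)]

def fetch_contiguous(j):
    # Contiguous cache: logical block j is simply a slice.
    s = slice(j * BLOCK, (j + 1) * BLOCK)
    return keys[s], values[s]

# Paged cache: logical blocks are scattered across a physical pool and
# found through a block table (logical index -> physical index).
block_table = [4, 0, 2]
pool = [None] * 5
for logical, physical in enumerate(block_table):
    pool[physical] = fetch_contiguous(logical)

def fetch_paged(j):
    return pool[block_table[j]]

ref = attend_reference(q, keys, values)
flash_like = attend_blockwise(q, fetch_contiguous, 3)
paged_like = attend_blockwise(q, fetch_paged, 3)
```

In this toy version the two variants differ only in the lookup inside `fetch_block`, which is the sense in which the change looks slight. A real GPU kernel would also have to deal with partially filled last blocks, batching, and memory access patterns, none of which are modeled here.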