by ex3ndr 700 days ago
I am wondering why flash attention is about 5x slower with variable masking than without it. The lack of good masking support almost zeroes out the optimizations.
1 comment
chillee 700 days ago
Where are you seeing these benchmarks?