by chillee 674 days ago
These benchmarks are on Ampere, where FA3 has no performance benefits over FA2.

On Hopper, FlexAttention is currently at about 80% of FlashAttention3's performance (about 500 TFLOPS peak).

1 comment

Not bad.