Hacker News
by apophis-ren, 502 days ago
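Flash attention is an implementation trick, not a separate attention variant: it computes the same exact attention, just without materializing the full score matrix. You can implement MHA or GQA, for example, with flash attention.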
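To make the distinction concrete, here is a toy sketch in pure Python. It is not the real FlashAttention kernel (which also tiles over queries and is written for GPU on-chip memory); it only illustrates the core idea of visiting keys/values block by block with a running max and running sum. All function names and the sample data are made up for illustration. The attention variant (GQA here, which reduces to MHA when there are as many KV heads as query heads) is chosen independently of which kernel computes it.

```python
import math

def naive_attention(q, K, V):
    """Reference kernel: softmax(q . K^T / sqrt(d)) V for one query vector."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, V)) / total
            for j in range(len(V[0]))]

def tiled_attention(q, K, V, block=2):
    """Same math, but K/V are visited in blocks with a running max and
    running denominator, so the full score row is never stored."""
    d = len(q)
    m = -math.inf             # running max of the scores seen so far
    l = 0.0                   # running softmax denominator
    acc = [0.0] * len(V[0])   # running unnormalized output
    for start in range(0, len(K), block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in Kb]
        m_new = max(m, max(scores))
        scale = math.exp(m - m_new)   # rescale old state to the new max
        l *= scale
        acc = [a * scale for a in acc]
        for s, v in zip(scores, Vb):
            w = math.exp(s - m_new)
            l += w
            acc = [a + w * x for a, x in zip(acc, v)]
        m = m_new
    return [a / l for a in acc]

def grouped_query_attention(Q_heads, K_heads, V_heads, attend):
    """GQA: query head h reads KV head h // group. `attend` is the kernel."""
    group = len(Q_heads) // len(K_heads)
    return [attend(q, K_heads[h // group], V_heads[h // group])
            for h, q in enumerate(Q_heads)]

# Hypothetical sample data: 4 query heads sharing 2 KV heads, 3 keys, d = 2.
Q = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.4], [-0.2, 0.7]]
K = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
     [[0.5, -0.5], [2.0, 0.1], [-1.0, 0.3]]]
V = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
     [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]]

out_naive = grouped_query_attention(Q, K, V, naive_attention)
out_tiled = grouped_query_attention(Q, K, V, tiled_attention)
```

Swapping `naive_attention` for `tiled_attention` changes how the result is computed, not what is computed: the two outputs should agree up to floating-point rounding, which is the sense in which flash attention is orthogonal to MHA/GQA.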