by 3abiton
708 days ago
To add to the discussion: from a practical perspective, AMD hardware totally sucks and has yet to get a proper flash-attention-2 implementation. ROCm is slowly becoming usable, but it's still not close to being comparable with CUDA.