by lambda-research
612 days ago
The benchmark is matrix multiplication with the shapes `(6, 1500, 256) x (6, 256, 1500)`, which just aren't that big in the AI world. I think the gap would be larger with much larger matrices. E.g. Llama 3.1 8B, which is one of the smaller models, has matrix multiplications like `(batch, 14336, 4096) x (batch, 4096, 14336)`. I just don't think this benchmark is realistic enough.
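
To put rough numbers on the size difference, here is a back-of-the-envelope sketch. It assumes the usual `2 * m * k * n` FLOP count for a dense `(m, k) x (k, n)` product and takes both sets of shapes exactly as quoted above. The helper name `batched_matmul_flops` and the choice of `batch = 6` for the Llama-style shapes are mine, purely for illustration.

```python
def batched_matmul_flops(batch: int, m: int, k: int, n: int) -> int:
    """Approximate FLOPs for a batched (batch, m, k) x (batch, k, n) matmul.

    Each output element needs k multiplies and about k adds, hence the factor 2.
    """
    return 2 * batch * m * k * n


# Shapes from the benchmark: (6, 1500, 256) x (6, 256, 1500)
benchmark_flops = batched_matmul_flops(batch=6, m=1500, k=256, n=1500)

# Llama-3.1-8B-style shapes as quoted above, with an assumed batch of 6
# so that the two cases are compared at the same batch size.
llama_flops = batched_matmul_flops(batch=6, m=14336, k=4096, n=14336)

ratio = llama_flops / benchmark_flops

print(f"benchmark:   {benchmark_flops / 1e9:.2f} GFLOP")
print(f"llama-style: {llama_flops / 1e12:.2f} TFLOP")
print(f"ratio:       {ratio:.0f}x")
```

At equal batch size the larger shapes come out to over a thousand times more arithmetic per call, so fixed per-call overhead matters far less there than it does in the benchmark.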