by declaredapple
851 days ago
How many tokens/s are we talking for a 70B model? Last I saw, they performed really poorly, like low single digits t/s. Don't get me wrong, they're probably a decent value for experimenting, but that's flat-out pathetic compared to an A100 or H100. And I think they're useless for training?