by cheptsov
556 days ago
In this one we only used 3.1 405B FP8. We picked a single model to simplify the setup and were mostly looking at the memory saturation effect, so we compared inference metrics for the same model. I suppose comparing 3.1 and 3.2 would be difficult, as they are entirely different models. But I'm open to ideas.