by embedding-shape
209 days ago
Depends heavily on the architecture too. I think the free-for-all to find better sizes is still ongoing, and rightly so. GPT-OSS-120B, for example, fits in around 61GB of VRAM for me with MXFP4. Personally, I hope GPU makers start adding more VRAM instead, or, if one can dream, expandable VRAM.
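
For anyone wanting to sanity-check that figure, here is a rough back-of-the-envelope sketch. It assumes roughly 117e9 total parameters and that every weight is stored as MXFP4 (4-bit elements plus one shared 8-bit scale per block of 32). Both are simplifying assumptions on my part: in practice some tensors stay in higher precision, and the KV cache and activations come on top, so treat the result as a ballpark for weights only.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# MXFP4: 4-bit elements plus one 8-bit shared scale per block of 32 elements.
MXFP4_BITS = 4 + 8 / 32  # 4.25 effective bits per weight

# Assumption: ~117e9 total parameters, all of them stored as MXFP4.
N_PARAMS = 117e9

estimate_gb = weight_memory_gb(N_PARAMS, MXFP4_BITS)
print(f"Estimated weight footprint: {estimate_gb:.1f} GB")
```

The estimate lands in the same low-60s GB range as the number above, which suggests the weights account for nearly all of the observed VRAM use.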