by grey8 981 days ago
The Hugging Face page for Replit's 3B model says "The model has been trained on the MosaicML platform on 128 H100-80GB GPUs."

Source: https://huggingface.co/replit/replit-code-v1_5-3b

I'm not an ML engineer, just interested in the space, but as a general ballpark, training models like these from scratch takes hundreds to thousands of GPUs.