by loudmax 796 days ago
I like that they say the model was trained for 1.3 hours on 4 nodes of 8 x H100s. By my rough calculation, that should have cost a bit under $100: 1.3 hours x 4 nodes x 8 GPUs is 41.6 GPU-hours, which at roughly $2 per GPU-hour comes to about $83. Not free, but pretty cheap in the scheme of things. At least, once you know what you're doing.
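For anyone who wants to plug in their own numbers, here is a minimal sketch of that back-of-envelope estimate. The $2 per GPU-hour rate is just the assumed figure from the comment, not a quoted price, and `training_cost_usd` is a made-up helper name. It also assumes every GPU is billed for the full wall-clock time, with no setup, storage, or failed runs counted.

```python
def training_cost_usd(hours, nodes, gpus_per_node, usd_per_gpu_hour):
    """Rental cost if every GPU is billed for the full wall-clock time."""
    gpu_hours = hours * nodes * gpus_per_node
    return gpu_hours * usd_per_gpu_hour


# Figures from the comment: 1.3 hours on 4 nodes of 8 x H100.
# The $2 per GPU-hour rate is an assumption, not a quoted price.
cost = training_cost_usd(hours=1.3, nodes=4, gpus_per_node=8, usd_per_gpu_hour=2.0)
print(f"{1.3 * 4 * 8:.1f} GPU-hours, about ${cost:.2f}")
```

That lands a little under the round $100 figure. Doubling the assumed hourly rate doubles the total, so the estimate is only as good as the price you can actually rent at.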