by perfmode 1023 days ago
What’s the difference?

You end up paying more in the latter instance.
And that's not counting the cost of learning how to cluster 500 GPUs together, the cost of learning how to train models efficiently on 500 GPUs, the cost of convincing a cloud provider to let you have 500 GPUs, the cost of finding a cloud provider that actually has 500 GPUs you can book, etc.