Hacker News
by patrickhogan1 841 days ago
At the MIT event, Altman was asked if training GPT-4 cost $100 million; he replied, “It's more than that.”

Training costs are decreasing, but each update to GPT-4 (e.g. the training data cutoff moving to December 2023) means the model has been retrained. The compute costs of training a model like GPT-4 are significant.

Also, keep in mind that not every trained model becomes ready for production. Some are discarded, similar to how you might burn a few cookies while baking.

1 comment

I don't think they always retrain models from scratch. Sometimes they might do continual learning (take the old model and train it on newer data).
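
To make the "take the old model and train it on newer data" idea concrete, here is a toy sketch. It is only an illustration under made-up assumptions: a one-parameter linear model, invented data, and hypothetical helper names (`train`, `loss`). It says nothing about how any lab actually updates its models. It just compares a few extra training steps starting from the old weights against the same number of steps starting from zero.

```python
# Toy illustration only: a 1-D linear model y = w * x fitted by gradient descent.
# All names and numbers are invented for this sketch.

def train(w, data, steps, lr=0.01):
    """Run `steps` full-batch gradient-descent updates on mean squared error, starting from w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def loss(w, data):
    """Mean squared error of the model y = w * x on `data`."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

old_data = [(x, 2.0 * x) for x in range(1, 6)]  # "old" data: true slope 2.0
new_data = [(x, 2.2 * x) for x in range(1, 6)]  # "newer" data: true slope 2.2

# Original training run from scratch (w = 0) on the old data.
w_old = train(0.0, old_data, steps=200)

# Spend the same small budget of extra steps on the newer data, from two starting points.
budget = 5
w_continued = train(w_old, new_data, steps=budget)  # continue from the old model
w_scratch = train(0.0, new_data, steps=budget)      # start over from nothing

print("continued:", loss(w_continued, new_data))
print("scratch:  ", loss(w_scratch, new_data))
```

In this toy setup the continued run ends up with a lower loss on the newer data for the same number of steps, because it starts close to the answer. Real models are far messier (forgetting older data is a known risk, for example), so treat this as intuition rather than evidence.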