by NitpickLawyer
254 days ago
This has been said by enough people in the know to be considered true by now. Not just OpenAI; Anthropic and Meta have said it too. You train the best of the best, and then use it to distill/curate/inform the next training run, on something that makes sense to serve at scale. That's how you get from GPT4 / o3 prices ($80 / $60 per Mtok) to gpt5 prices ($10 per Mtok) to gpt5-mini ($2 per Mtok). Then you use a combination of the best models to amplify your training set and enhance it for the next iteration. And then you repeat the process at gen n+1.