by PaulHoule 1171 days ago
That's another problem with that article. The gap between using an API and fine-tuning is much smaller than the gap between fine-tuning and developing a foundation model. I would look at BloombergGPT as an example:

https://arxiv.org/abs/2303.17564

Here you have a company that can assemble a document collection about the same size as "The Pile", add it to "The Pile", and train a model on the combined data. They're not just a big company; they are in the information business, so it is clear that it's worth it to them.