by nathan_compton
1282 days ago
It can reproduce a statistically plausible paragraph, certainly. But there is a great deal more to research than producing statistically plausible paragraphs. It doesn't _understand_ anything! I've actually worked on a project that tried to use GPT-like models to summarize scientific results, and the problem is that they get shit wrong all the time! You have to be an expert to separate the wheat from the chaff. It operates like a mendacious search engine pretending to be a person.
|
The good thing is that we'll be able to generate training data with our models and use verifiers to filter out the junk. Then we can retrain the models on what survives. This matters because we are approaching the limit of available training data. We need to generate more, but generated data is worthless unless we verify it. If we succeed, we can train GPT-5. Human data will be just 1% of the total; the race is on to generate the master dataset of the future. I read in a recent paper that such a method was used to improve text captions in the LAION dataset. https://laion.ai/blog/laion-5b/
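
To make the generate-then-filter idea concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: `noisy_generator` is a hypothetical stand-in for a model that is sometimes wrong, `arithmetic_verifier` is a hypothetical checker for a domain where answers can be verified exactly, and `build_verified_dataset` is just a name for the filtering step. It is not how any real model or the LAION pipeline works.

```python
import random
from typing import Callable, List, Tuple

# A sample is (question, proposed answer). Hypothetical format for this toy.
Sample = Tuple[str, int]


def noisy_generator(rng: random.Random) -> Sample:
    """Hypothetical stand-in for a model: proposes a sum, wrong about 30% of the time."""
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    answer = a + b
    if rng.random() < 0.3:
        answer += rng.choice([-2, -1, 1, 2])  # inject a plausible-looking error
    return f"{a}+{b}", answer


def arithmetic_verifier(sample: Sample) -> bool:
    """Hypothetical verifier: recompute the sum and compare with the proposed answer."""
    question, answer = sample
    a, b = question.split("+")
    return int(a) + int(b) == answer


def build_verified_dataset(
    generate: Callable[[random.Random], Sample],
    verify: Callable[[Sample], bool],
    n_candidates: int,
    seed: int = 0,
) -> List[Sample]:
    """Generate candidates, then keep only the ones the verifier accepts."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n_candidates)]
    return [sample for sample in candidates if verify(sample)]


if __name__ == "__main__":
    data = build_verified_dataset(noisy_generator, arithmetic_verifier, 1000)
    print(f"kept {len(data)} of 1000 candidates")
```

The catch, of course, is that the filtered set is only as good as the verifier. Arithmetic is easy to check; summaries of scientific results are not, which is the hard part of the whole plan.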