by gwern
784 days ago
> If you trained on specific generated data with real distributions

It was trained on generated data from real distributions! The datasets LLMs are trained on include gigabytes of real data from real distributions, in addition to all of the code/stats/etc samples. The question you should be asking is 'why did it stop being able to predict real distributions?' And we already know the answer: RLHF. https://news.ycombinator.com/item?id=40227082