by breckenedge 459 days ago
Since you’re releasing the code to GitHub, do you think you’ll eventually run into issues with the training data including prior versions of the game?

The implied scenario being that the memory of its own output would result in the model producing degraded future output? Why is that a given?
Probably the same reason that close relatives marrying each other for generations produces genetic problems.
Not the same reason at all. In genetics the reason is that you're losing gene variety and eventually recessive genes aren't suppressed anymore. In the case of an LLM it's just error accumulation.
It's a few days late, but "losing gene variety" isn't the cause. What happens is that genetic errors compound and are more likely to be expressed, i.e. "error accumulation".
You're wrong. You clearly have the Internet; I don't understand why you won't just google it and learn about it instead of claiming stuff that is bs.
How about a number of grad level genetics courses? Does that beat your google search? Because that is what I have. And what I am telling you is what happens.

This is really easily searched (as you said).

You might read up on it if interested. Check out why inbreeding can lead to expression of genetic defects. What is the mechanism? (hint: it's not "losing gene diversity" or "suppression").
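
For anyone following along, here is a rough sketch of the textbook mechanism, not a claim about what either commenter had in mind. In standard population genetics, a recessive allele with frequency q shows up in homozygous form with probability q^2 + F*q*(1 - q), where F is the inbreeding coefficient. The function name and the example values below (q = 0.01, and F = 1/16 for children of first cousins) are illustrative assumptions, not figures from this thread.

```python
def homozygous_recessive_probability(q: float, f: float) -> float:
    """Probability that an individual carries two copies of a recessive allele.

    q: frequency of the recessive allele in the population (assumed value)
    f: inbreeding coefficient F, the chance both copies are identical by descent
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("q and f must both lie in [0, 1]")
    # With f = 0 this reduces to the random-mating (Hardy-Weinberg) value q^2.
    return q * q + f * q * (1.0 - q)


if __name__ == "__main__":
    q = 0.01  # illustrative allele frequency
    for label, f in [("random mating", 0.0), ("first-cousin parents", 1 / 16)]:
        print(label, homozygous_recessive_probability(q, f))
```

With these illustrative numbers, the allele is no more common in the population, but the chance of it being expressed rises roughly sevenfold, because related parents are more likely to pass on the same copy.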

Read about model collapse. The TL;DR is garbage in, garbage out.

https://en.wikipedia.org/wiki/Model_collapse
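
A toy illustration of the error-accumulation idea, under heavy assumptions: the "model" here is just a Gaussian fitted to a small sample, and each generation is trained only on samples drawn from the previous generation's fit. This is a simplified sketch in the spirit of the linked article, not how any real LLM is trained; the function name, sample size, and generation count are all made up for the example.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from its own previous fit.

    Returns the fitted standard deviation at each generation, starting with
    the original distribution's value of 1.0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: the "real" data distribution
    stds = [std]
    for _ in range(generations):
        # Train the next generation only on the previous generation's output.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood estimate
        stds.append(std)
    return stds


if __name__ == "__main__":
    history = simulate_collapse()
    print("initial std:", history[0])
    print("final std:  ", history[-1])
```

Each refit loses a little of the distribution's spread on average, and because nothing fresh ever enters the loop, the losses compound: over many generations the fitted spread shrinks toward zero. Mixing real data back in at every step is the usual way to break that feedback loop in this kind of toy setup.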