by jeremyjh
287 days ago
I think we're talking past each other, so I'll try once more. Suppose you train an LLM on a very small corpus of data, such as all the content of the Library of Congress. Then you have that LLM author new works, and you train a new LLM on the original corpus plus this new material. Do you really think you've addressed the core issue in the SP? Can more parameters be meaningfully trained, even if you add more GPUs? To me, the answer is clearly no. There is no new information content in the generated data. It's just a remix of what already exists.
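
A toy sketch of this thought experiment, assuming a bigram Markov chain stands in for the LLM and using a made-up one-line corpus (the names `train`, `generate`, and `support` are hypothetical helpers, not from any library). A real LLM generalizes in ways a bigram table does not, so this only illustrates the "remix" intuition; it does not prove the claim for large models.

```python
import random
from collections import defaultdict


def bigrams(tokens):
    """Return the set of adjacent token pairs in one document."""
    return set(zip(tokens, tokens[1:]))


def train(docs):
    """Fit a bigram model: map each token to the list of tokens seen after it."""
    model = defaultdict(list)
    for tokens in docs:
        for a, b in zip(tokens, tokens[1:]):
            model[a].append(b)
    return model


def generate(model, start, length, rng):
    """Sample a new 'work' by walking the bigram table from a start token."""
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:  # dead end: token was never followed by anything
            break
        out.append(rng.choice(followers))
    return out


def support(model):
    """Which transitions the model considers possible, ignoring frequencies."""
    return {a: set(bs) for a, bs in model.items()}


# Made-up stand-in for the "original corpus".
corpus = "the cat sat on the mat and the dog sat on the rug".split()
rng = random.Random(0)

model = train([corpus])
synthetic = generate(model, "the", 50, rng)

# Every transition in the generated text already exists in the corpus.
new_bigrams = bigrams(synthetic) - bigrams(corpus)

# Retrain on original + generated material, kept as separate documents.
retrained = train([corpus, synthetic])
same_support = support(retrained) == support(model)

print("new bigrams in synthetic text:", new_bigrams)
print("retrained model allows the same transitions:", same_support)
```

In this toy setting the retrained model can express exactly the same transitions as the first one. Only the relative frequencies shift, toward whatever the first model happened to sample.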