Hacker News new | ask | show | jobs
by dvt 447 days ago
> but if you had found/obtained access to the corpus that the model was pre-trained with, that would have been kind of interesting

Definitionally, that input gets compressed into the weights. Pretty sure there's a proof somewhere that shows LLM training is basically a one-way (lossy) compression, so there's no way to go back afaik?

1 comments

Not the original, but a lossy facsimile that's Good Enough for almost anything. And as the short history of LLMs and other nets has shown us, they're often not even all that lossy.