Hacker News
by anshorei 1262 days ago
> have the model output contain 90% of the original training content verbatim

Considering that the size of a model is but a fraction of the size of its training data, this statement is in no way accurate.

1 comment

I'd think it's more comparable to compression. No one would argue that a jpeg or mp3 file can't contain a reasonable representation of the original because it is smaller.
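
To make the compression point concrete, here is a toy sketch of lossy encoding, assuming a made-up signal (a sine wave) and a hypothetical 4-bit quantizer. It is not how JPEG or MP3 work internally, and it says nothing about what a trained model retains; it only shows that a much smaller encoding can still reconstruct a reasonable approximation of the original.

```python
import math

def quantize(samples, bits):
    """Lossy step: map each float in [-1, 1] to one of 2**bits integer codes."""
    levels = (1 << bits) - 1
    return [round((s + 1.0) / 2.0 * levels) for s in samples]

def dequantize(codes, bits):
    """Map integer codes back to approximate floats in [-1, 1]."""
    levels = (1 << bits) - 1
    return [c / levels * 2.0 - 1.0 for c in codes]

# Toy "original" (an assumption for illustration): 1000 sine samples,
# each counted as a 64-bit float.
original = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]

BITS = 4  # hypothetical choice; fewer bits means smaller output and larger error
codes = quantize(original, BITS)
restored = dequantize(codes, BITS)

original_bits = len(original) * 64
compressed_bits = len(codes) * BITS
max_error = max(abs(a - b) for a, b in zip(original, restored))

print(f"size ratio: {compressed_bits / original_bits:.4f}")
print(f"max error:  {max_error:.4f}")
```

The encoded form is 1/16 the size of the original, and every restored sample is within half a quantization step of its source value. The representation is smaller and not exact, but it is still recognizably the same signal.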