Hacker News new | ask | show | jobs
by morpheos137 300 days ago
It is shocking to me that 99% of people on YC news don't understand that LLMs encode tokens not verbatim training data. This is why I don't understand the NYT lawsuit against openAI. I can't see ChatGPT reproducing any text verbatim. Rather it is fine grained encoding of style in a multitude of domains. Again LLMs do not contain training data, they are a lossy compression of what the training data looks like.