HN Mirror

	by nosuchthing 162 days ago
	If an LLM always emits the statistically most likely next token, anything in its training data that ranks below that token never surfaces. So they add random jitter when picking the next token. With that randomness come statistically irrelevant results.
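
For anyone who wants to see the trade-off concretely, here is a rough sketch of the two decoding modes. It assumes the "random jitter" means temperature sampling, which is the common setup but not something the comment spells out. The three-word vocabulary, the scores, and the function names are all made up for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Always return the index of the single most likely token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample(logits, temperature=1.0, rng=random):
    """Draw one token index at random, weighted by the softmax probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical next-token scores, not taken from any real model.
vocab = ["the", "a", "zebra"]
logits = [2.0, 1.5, -1.0]

if __name__ == "__main__":
    rng = random.Random(0)
    print("greedy: ", vocab[greedy(logits)])
    print("sampled:", [vocab[sample(logits, 1.0, rng)] for _ in range(10)])
```

Greedy decoding never emits anything except "the" here, while sampling occasionally emits "zebra". Raising the temperature makes the unlikely token more frequent, which is the trade-off being described.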