| HN Mirror


	by blackle 772 days ago
	My understanding is that minimizing perplexity (what LLMs are generally optimized for) is equivalent to finding a good probability distribution over the next token.
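
To make the claim above concrete, here is a rough sketch of how perplexity is usually defined: the exponential of the average negative log-probability the model assigned to the tokens that actually occurred. The function name `perplexity` and the sample probabilities are made up for illustration and do not come from any particular library or model.

```python
import math

def perplexity(probs_of_true_tokens):
    """Exponential of the mean negative log-probability.

    Each entry is the probability the model gave to the token that
    actually came next (hypothetical values, not real model output).
    """
    nll = [-math.log(p) for p in probs_of_true_tokens]
    return math.exp(sum(nll) / len(nll))

# A model that puts most of its mass on the right token each time.
confident = [0.9, 0.8, 0.95]

# A model that guesses uniformly over a 4-token vocabulary.
uniform = [0.25, 0.25, 0.25]

print(perplexity(confident))
print(perplexity(uniform))
```

The uniform guesser comes out at a perplexity of about 4 (the vocabulary size), while the confident one lands close to 1. Lower perplexity means the predicted distribution put more probability on what actually came next, which is the sense in which minimizing it yields a better next-token distribution.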