Hacker News
by blackle, 772 days ago
My understanding is that minimizing perplexity (what LLMs are generally optimized for) is equivalent to finding a good probability distribution over the next token.
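
To make that concrete, here is a minimal sketch, assuming the usual definition of perplexity as the exponential of the average negative log-probability a model assigns to the tokens that actually occurred. The function name and the two probability lists are made up for illustration; they don't come from any real model.

```python
import math

def perplexity(probs_of_actual_tokens):
    """Perplexity = exp(mean negative log-probability of the observed tokens)."""
    n = len(probs_of_actual_tokens)
    mean_nll = -sum(math.log(p) for p in probs_of_actual_tokens) / n
    return math.exp(mean_nll)

# Hypothetical probabilities that two models assign to the same four
# observed next tokens (illustrative numbers, not measured data).
better_model = [0.5, 0.5, 0.5, 0.5]      # puts more mass on what really came next
worse_model = [0.25, 0.25, 0.25, 0.25]   # spreads its mass more thinly

ppl_better = perplexity(better_model)  # roughly 2
ppl_worse = perplexity(worse_model)    # roughly 4
```

The model that puts more probability on the tokens that actually appear gets the lower perplexity, so pushing perplexity down amounts to pushing the predicted next-token distribution toward the real one.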