Hacker News
by bhickey 893 days ago
Why aren't they computing the next-token marginal and sampling from that? The only explanation I can come up with is that it's a reasonable way to avoid dealing with different tokenizers.