Hacker News
by rugina, 781 days ago
I think neural machine translation (NMT) was broken all along. Not in the neural network part, but in how the final translation is chosen from the model's output.
https://aclanthology.org/2020.coling-main.398.pdf
1 comment
astrange, 781 days ago
Since LLMs are loosely based on NMT models, research on newer sampling methods like Mirostat might help here.