Hacker News
Predicting the Order of Upcoming Tokens Improves Language Modeling
(arxiv.org)
7 points by wavelander 293 days ago | 1 comment
NitpickLawyer 293 days ago
Are any of these methods doable on pre-trained models? Like freezing the model and training only these add-ons? Having to redo the training runs with these optimisations doesn't sound too practical, in the grand scheme of things.
impossiblefork 292 days ago
It's obviously practical for the next model you train from scratch. The point of research is not to improve existing commercial products.