by mdp2021 527 days ago
> multi token prediction of sufficient length

Is multi-token prediction the same as predicting the embedding of a single complex token (one that represents the articulation of those input tokens in a sentence)?

1 comment

To be honest, I don’t know. The only way to find out may be to build and measure all these variations.