by godelski 242 days ago

  > It won't generate the word eggs even though eggs probably comes after lay frequently
Even a simple N-gram model won't predict "eggs". You're misunderstanding by oversimplifying.

Next-token prediction is still context-based. It depends not only on the previous token but on the previous N-1 tokens. With "cat" in the context, even a 3-gram (trigram) model should give you words like "down" instead of "eggs".
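
To make that concrete, here's a minimal sketch of a count-based N-gram predictor. The toy corpus and the names `train_ngram` and `predict` are made up for illustration and don't come from any library. The corpus is deliberately skewed so that "eggs" follows "lay" most of the time, which is the scenario in the quote.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only. "eggs" follows "lay"
# three times out of four, so it dominates when "lay" is the only context.
CORPUS = [
    "the hen lay eggs",
    "the hen lay eggs",
    "the duck lay eggs",
    "the cat lay down",
]

def train_ngram(sentences, n):
    """Count next-token frequencies conditioned on the previous n-1 tokens."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])  # the previous n-1 tokens
            counts[context][tokens[i]] += 1
    return counts

def predict(counts, context):
    """Return the most frequent next token for a context, or None if unseen."""
    context = tuple(context)
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

bigram = train_ngram(CORPUS, 2)   # conditions on 1 previous token
trigram = train_ngram(CORPUS, 3)  # conditions on 2 previous tokens

bigram_guess = predict(bigram, ["lay"])
trigram_guess = predict(trigram, ["cat", "lay"])
print(bigram_guess, trigram_guess)  # eggs down
```

The bigram model only sees "lay" and picks "eggs". The trigram model sees "cat lay" and picks "down", because in this corpus "eggs" never follows that pair. A real model would also need smoothing or backoff for unseen contexts, which this sketch skips by returning None.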