
Hmm, I think any example where it can get stuck is going to be a bit contrived, since really it's a question of how easy it is to recognize a valid prefix. Say, for example, you want the LLM to generate a valid chess match and it ends up in a situation with just the 2 kings left. If you're not careful with your definitions, you could end up in a loop that never ends.

That said, if you know all valid prefixes in your language in advance, then you can always tell when a token leaves no valid continuations.
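As a rough sketch of that check, assuming the language is small enough to list out in full: a token is only allowed if the prefix plus that token is still the start of some valid string. The names `VALID`, `VOCAB`, and `allowed_next_tokens` are made up for illustration and don't come from any particular library.

```python
def allowed_next_tokens(prefix, vocabulary, valid_strings):
    """Return the tokens that keep `prefix` extendable to some valid string."""
    return [
        tok for tok in vocabulary
        if any(s.startswith(prefix + tok) for s in valid_strings)
    ]

# Hypothetical finite language and toy token vocabulary.
VALID = {"sunny", "cloudy"}
VOCAB = ["s", "c", "su", "st", "un", "nny", "loudy", "x"]

print(allowed_next_tokens("", VOCAB, VALID))    # ['s', 'c', 'su']
print(allowed_next_tokens("s", VOCAB, VALID))   # ['un']
print(allowed_next_tokens("st", VOCAB, VALID))  # [] -> dead end
```

An empty result means the prefix is a dead end, so a token like "st" can be rejected before it is ever emitted. For a real grammar you can't enumerate the strings like this, which is where recognizing a valid prefix gets harder.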

> It absolutely will. But so will adding an extra newline

A newline is less likely to dramatically drop the quality. A greedy method could easily end up driving itself into a dead end (if not grammatically, then semantically).

Say you want it to give a weather prediction consisting of a description followed by a tag, 'sunny' or 'cloudy', and your model is on its way to generating:

    { 
      desc: "Strong winds followed by heavy rainfall.", 
      tag: "stormy" 
    }

If it ever gets to the 's' in 'stormy', it will be forced to pick 'sunny', even if that makes no sense in context.
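Here is a toy version of that failure, assuming character-level tokens and a fake "model" that simply wants to spell 'stormy'. `model_scores` and `greedy_constrained` are hypothetical stand-ins, not a real decoding API. The point is only to show how masking plus greedy choice plays out.

```python
ALLOWED_TAGS = ["sunny", "cloudy"]

def model_scores(prefix):
    """Hypothetical stand-in for an LLM: it prefers the next letter of 'stormy'."""
    target = "stormy"
    scores = {ch: 0.0 for ch in "abcdefghijklmnopqrstuvwxyz"}
    if len(prefix) < len(target):
        scores[target[len(prefix)]] = 1.0
    return scores

def greedy_constrained(allowed):
    """Greedily pick the best-scoring character that keeps a valid prefix."""
    out = ""
    while out not in allowed:
        scores = model_scores(out)
        # Mask out every character that would leave the set of valid prefixes.
        legal = {
            ch: sc for ch, sc in scores.items()
            if any(tag.startswith(out + ch) for tag in allowed)
        }
        out += max(legal, key=legal.get)
    return out

print(greedy_constrained(ALLOWED_TAGS))  # sunny
```

The first 's' is legal because 'sunny' starts with it, so the greedy step happily takes it. After that the 't' is masked out and the only legal path left is 'u', 'n', 'n', 'y'. Nothing here ever looks back to ask whether 'cloudy' would have been the better fit overall.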