Hacker News new | ask | show | jobs
by amanrs 518 days ago
This is harder than it looks.

First "token-healing" doesn't work. Consider the case "app" where the most likely options are "ap|praisal" or "apple|sauce". You can't just sample all tokens that start with app, or you'd miss appraisal.

Second, it's easy to come up with a naive algorithm that samples from the true distribution. It's very difficult to make this algorithm efficient.