by remilouf 872 days ago
This article presents a way to make structured generation with LLMs much faster than standard generation, but what I find most interesting is how, towards the end, it highlights the issues that tokenization entails.