by imtringued 613 days ago
You mean this? https://lmsys.org/blog/2024-02-05-compressed-fsm/

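It has a clever way to decode multiple tokens at once rather than one token at a time: when the output constraint (e.g. a JSON schema) leaves only one valid continuation, it jumps ahead over the whole forced string instead of generating it token by token.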

Corresponding project link: https://github.com/sgl-project/sglang
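
To make the "multiple tokens at once" idea concrete, here is a rough toy sketch of how I understand it. It is not SGLang's code: it works on characters instead of tokens, ignores the tokenization issues a real system has to handle, and the names (`build_toy_fsm`, `decode`, `scripted_model`) and the `{"age": <digits>}` pattern are all made up for illustration. The point is only that states with a single outgoing edge can be skipped without asking the model.

```python
from typing import Callable, Dict, List, Tuple

# Toy character-level FSM: state -> {allowed char: next state}.
FSM = Dict[int, Dict[str, int]]


def build_toy_fsm() -> Tuple[FSM, int]:
    """FSM for the hypothetical pattern {"age": <one or more digits>}."""
    prefix = '{"age": '
    fsm: FSM = {}
    for i, ch in enumerate(prefix):
        fsm[i] = {ch: i + 1}           # forced: exactly one legal character
    first = len(prefix)                # expects the first digit
    more = first + 1                   # has seen at least one digit
    final = first + 2                  # accepting state
    fsm[first] = {d: more for d in "0123456789"}
    fsm[more] = {d: more for d in "0123456789"}
    fsm[more]["}"] = final
    fsm[final] = {}
    return fsm, final


def decode(fsm: FSM, final: int,
           choose: Callable[[str, List[str]], str]) -> Tuple[str, int]:
    """Walk the FSM; only call `choose` (the stand-in model) at real branches."""
    state, out, model_calls = 0, [], 0
    while state != final:
        edges = fsm[state]
        if len(edges) == 1:
            # Jump forward: the constraint forces this character, no model call.
            ch = next(iter(edges))
        else:
            ch = choose("".join(out), sorted(edges))
            model_calls += 1
        out.append(ch)
        state = edges[ch]
    return "".join(out), model_calls


def scripted_model(script: str) -> Callable[[str, List[str]], str]:
    """Stand-in for an LLM: replays a fixed script of choices."""
    it = iter(script)

    def choose(prefix: str, allowed: List[str]) -> str:
        ch = next(it)
        assert ch in allowed, f"{ch!r} not allowed after {prefix!r}"
        return ch

    return choose


fsm, final = build_toy_fsm()
text, calls = decode(fsm, final, scripted_model("42}"))
print(text, calls)  # {"age": 42} 3
```

Here 8 of the 11 output characters come from the forced prefix, so the stand-in model is only consulted 3 times. As far as I can tell the blog post gets the same effect by compressing runs of single-transition states into one edge, which is where the speedup for rigid formats like JSON comes from.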