Sequoia: Speculative decoding boosting LLM inference by 8-10x (infini-ai-lab.github.io)
3 points by fgfm 822 days ago