Hacker News
by lukebechtel 103 days ago
Yes, speculative decoding would make both us and vLLM faster, but we expect the speedup to be roughly the same on both sides, so we left it out of this comparison. Worth another test!