Hacker News
Accelerating LLM Inference with Parallel Draft Models (PARD) (amd.com)
1 point by dhruvdh 437 days ago