Hacker News
by jpgvm 482 days ago
Ahh, I missed that. Yes, prefix caching and RAG are two cases where you will want something like this at inference time.