Hacker News
Accelerate CPU Based LLM Inference with a Vector Index on the Output Embeddings (martinloretz.com)
1 point by dithered_djinn 459 days ago