Hacker News
How to compute LLM embeddings 3X faster with model quantization (medium.com)
2 points by shutty 943 days ago