How to compute LLM embeddings 3X faster with model quantization

	How to compute LLM embeddings 3X faster with model quantization (medium.com)
	2 points by shutty 943 days ago