


	by montanalow 1101 days ago
	Quantization allows PostgresML to fit larger models in less RAM. Quantized models also run inference significantly faster on NVIDIA, Apple and Intel hardware. Half-precision floating point and quantized optimizations are now available for your favorite LLMs downloaded from Hugging Face.
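
	For a rough sense of why fewer bits per weight means less RAM, here is a minimal sketch in plain Python. It is not PostgresML's implementation: the function names, the 7B parameter count, and the symmetric per-tensor int8 scheme are all assumptions chosen for illustration, and real quantization formats carry extra overhead (scales, activations, KV cache) that this ignores.

```python
def weight_memory_gib(n_params: int, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / 2**30


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative scheme).

    Maps floats to integers in [-127, 127] and returns (ints, scale).
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(ints, scale):
    """Recover approximate float weights from the quantized integers."""
    return [q * scale for q in ints]


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumed model size, for illustration only
    for name, bits in [("float32", 32), ("float16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{name:8s} {weight_memory_gib(n_params, bits):6.1f} GiB")

    ints, scale = quantize_int8([0.12, -0.5, 0.33, 1.27])
    print(ints, dequantize(ints, scale))
```

	Halving the bits per weight halves the weight memory, which is the sense in which half precision and int8 let a larger model fit in the same RAM. The cost is rounding error of at most half a quantization step per weight in this scheme.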