Hacker News
PostgresML Adds GPTQ and GGML Quantized LLM Support for HuggingFace Transformers (postgresml.org)
4 points by montanalow 1101 days ago
1 comment

Quantization allows PostgresML to fit larger models in less RAM. The GPTQ and GGML quantization methods also perform inference significantly faster on NVIDIA, Apple, and Intel hardware. Half-precision floating point and quantized optimizations are now available for your favorite LLMs downloaded from Hugging Face.
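
To put rough numbers on the RAM claim, here is a minimal back-of-the-envelope sketch. It is not PostgresML code: the `weight_memory_gib` helper, the format labels, and the 7B parameter count are all hypothetical and chosen for illustration. It counts weight storage only and ignores activations, the KV cache, and the per-group scale metadata that real quantized formats add, so actual footprints will be somewhat higher.

```python
# Bits used to store one weight in each format (weights only; real quantized
# formats add some overhead for scales and zero points, ignored here).
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gib(n_params: int, fmt: str) -> float:
    """Approximate GiB needed to hold n_params weights in the given format."""
    total_bytes = n_params * BITS_PER_WEIGHT[fmt] / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    n_params = 7_000_000_000  # hypothetical 7B-parameter model
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: {weight_memory_gib(n_params, fmt):.1f} GiB")
```

Under these assumptions, going from fp32 to fp16 halves the weight footprint and 4-bit storage cuts it to one eighth, which is why a model that does not fit in memory at full precision may fit once quantized.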