Hacker News
by ac29 461 days ago
Some models are more sensitive to quantization than others; presumably AI Studio is running the full 16-bit model.

Maybe try the 8-bit quant if you have the hardware for it:

  ollama run hf.co/unsloth/gemma-3-27b-it-GGUF:Q8_0
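For a rough sense of what "the hardware for it" means, here is a back-of-the-envelope sketch of weight memory alone. It assumes about 27e9 parameters (read off the "27b" in the model name) and roughly 8.5 bits per weight for Q8_0 (8-bit values plus a per-block scale). Both numbers are assumptions, and the estimate ignores KV cache, context length, and runtime overhead, so real usage will be higher.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in bytes."""
    return n_params * bits_per_weight / 8


# Assumption: parameter count taken from the "27b" in the model name.
N_PARAMS = 27e9

# Assumption: Q8_0 costs ~8.5 bits per weight once block scales are included.
FORMATS = [("fp16", 16.0), ("Q8_0", 8.5)]

for label, bits in FORMATS:
    gb = weight_bytes(N_PARAMS, bits) / 1e9
    print(f"{label}: ~{gb:.1f} GB of weights")
```

So the 8-bit quant needs roughly half the memory of the 16-bit weights, which is still a lot for a single consumer GPU.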


I tested the full FP16 GGUF.