Hacker News
by treprinum 559 days ago
Quantized, sure, but there is some loss of variability in the output that one can notice quickly with 30B models. If you want to use the fp16 version, you are out of luck.