by mgambati 3 hours ago
With Q2 you wouldn’t get good results. For coding, you ideally want at least Q8.

According to this very article, 4-bit dynamic is essentially lossless.
Watch out. Those claims are often made based on KL-divergence over some arbitrary corpus, not performance in the real world or benchmarks.

I’ve found that I need to go a couple steps past whatever quantizations are good enough in the KL-divergence testing to get good performance in real tasks with long context. So when Q4 is claimed to be lossless I end up with Q5 or Q6 for actual long-context tasks.
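
For anyone unfamiliar with the metric, here is a minimal sketch of how a KL-divergence comparison between a full-precision model and a quantized one is typically computed. The function names and the toy logits are made up for illustration; a real evaluation would pull per-token logits from both models over some corpus.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions over the same vocabulary.
    # eps guards against log(0) when q assigns near-zero probability.
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def mean_token_kl(ref_logits, quant_logits):
    # Average KL over token positions: one logit vector per position,
    # reference (full-precision) model first, quantized model second.
    kls = [kl_divergence(softmax(r), softmax(s))
           for r, s in zip(ref_logits, quant_logits)]
    return sum(kls) / len(kls)

# Toy example: two positions, vocabulary of three tokens (hypothetical values).
ref = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quant = [[1.9, 1.1, 0.1], [0.4, 0.7, 2.8]]
print(mean_token_kl(ref, quant))
```

The reported number is an average over whatever token positions the corpus happens to contain, so a low score only says the two models agree on that corpus.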