
I haven't touched ooba in a while. What's the situation like with exl2 vs. the non-homogeneous quantization methods people are using, like Q3_K_S or whatever? IIRC, while exl2 is faster, the GPTQ quants were outperforming it in terms of accuracy, especially at lower bit depths.