


	by embedding-shape 54 days ago
	Sounds like the bigger model might be running at a worse quantization? Quantization matters a lot for quality; basically anything below Q8 is borderline unusable. If a benchmark doesn't already specify the quantization, it probably should.
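
To give a rough feel for why bit width matters, here is a minimal sketch that quantizes some made-up weights and measures how much is lost on the round trip. It assumes plain symmetric per-tensor quantization, which is simpler than the block-wise formats real runtimes use for names like Q8 or Q4, and `quantize_roundtrip` and `rms_error` are hypothetical helpers written for this illustration. Reconstruction error is only a proxy, not a measure of benchmark quality.

```python
import random

def quantize_roundtrip(weights, bits):
    """Symmetric per-tensor quantization: round to signed ints, then dequantize."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax     # one scale for the whole tensor (assumption)
    return [round(w / scale) * scale for w in weights]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

random.seed(0)
# Stand-in weights drawn from a normal distribution; not taken from any real model.
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

errors = {bits: rms_error(weights, quantize_roundtrip(weights, bits)) for bits in (8, 6, 4)}
for bits, err in errors.items():
    print(f"{bits}-bit RMS round-trip error: {err:.4f}")
```

The error grows as the bit width drops, since each bit removed roughly doubles the step size. How much of that shows up in actual output quality depends on the model and the quantization scheme, which is one more reason to state it alongside benchmark numbers.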