| HN Mirror

	by cpldcpu 516 days ago
	These benchmarks are mostly focused on math, which benefits a lot from improved CoT and is also less sensitive to the "reduced knowledge" of a smaller model. Vibes are important in this case...