by cpldcpu 516 days ago
These benchmarks are mostly focused on math, which benefits a lot from improved CoT and is also less sensitive to the "reduced knowledge" of a smaller model.

Vibes are important in this case...