Hacker News
by cpldcpu 516 days ago
These benchmarks are mostly focused on math, which benefits a lot from improved CoT and is also less sensitive to the "reduced knowledge" of a smaller model.
Vibes are important in this case...