Hacker News
by meatmanek, 309 days ago
Reasoning models do a lot better at AIME than non-reasoning models, with o3-mini getting 85% and 4o-mini getting 11%. It makes some sense that this would apply to small models as well.