by orange_puff
526 days ago
This is very interesting, but a couple of things to note:
1. o1 still scores above 40% on the varied Putnam problems, a feat most math students would not achieve.
2. o3 solved 25% of the Epoch AI dataset.
- There was an interesting post that called into question how difficult some of those problems actually are, but the result still seems very impressive.

I think a fair conclusion here is that reasoning models are still really good at solving very difficult math and competitive programming problems, just better at ones they have seen before.