OpenAI o3 just scored in the 99.8th percentile on CodeForces using brute force (huggingface.co)
2 points by wluk 496 days ago
1 comment

"These results demonstrate that o3 outperforms o1-ioi without relying on IOI-specific, hand-crafted test-time strategies. Instead, the sophisticated test-time techniques that emerged during o3 training, such as generating brute-force solutions to verify outputs, served as a more than adequate replacement"

"The model not only writes and executes code to validate its solutions against public test cases, it also refines its approach based on these verifications.

Figure 6 shows an advanced test-time strategy discovered by o3: for problems where verification is nontrivial, it often writes simple brute-force solutions — trading efficiency for correctness — then cross-checks the outputs against its more optimized algorithmic implementations.

This self-imposed validation mechanism lets o3 catch potential errors and improve the reliability of its solutions."
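For anyone who hasn't seen the trick the quote describes, here is a minimal sketch of brute-force cross-checking in Python. It is not taken from the paper or from o3's output: the maximum-subarray problem, the function names, and the random-input ranges are all assumptions chosen only to illustrate the idea of comparing an optimized solution against a slow but obviously correct one.

```python
import random


def max_subarray_fast(nums):
    """Optimized candidate: Kadane's algorithm, O(n)."""
    best = current = nums[0]
    for x in nums[1:]:
        # Either extend the running subarray or start fresh at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def max_subarray_brute(nums):
    """Reference: try every non-empty contiguous subarray, O(n^2)."""
    best = nums[0]
    for i in range(len(nums)):
        total = 0
        for j in range(i, len(nums)):
            total += nums[j]
            best = max(best, total)
    return best


def cross_check(fast, brute, trials=500, seed=0):
    """Return the first small random input where the two disagree, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Small inputs keep the brute force cheap and make failures easy to read.
        nums = [rng.randint(-10, 10) for _ in range(rng.randint(1, 8))]
        if fast(nums) != brute(nums):
            return nums
    return None


if __name__ == "__main__":
    mismatch = cross_check(max_subarray_fast, max_subarray_brute)
    print("no mismatch found" if mismatch is None else f"mismatch on {mismatch}")
```

Agreement on random small inputs is evidence, not proof, that the fast version is right, but a single mismatch gives a concrete failing case to debug. That is the efficiency-for-correctness trade the quoted passage mentions.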