by westurner
204 days ago
What do other models trained on the same problems score? What about if they are RL'd to not reproduce things word for word?

Why do you think that the 2024 Putnam problems that they used to test were in the training data?

/? "Art of Problem Solving" Putnam https://www.google.com/search?q=%22Art+of+Problem+Solving%22...

From p.3 of the PDF:

> Curating Cold Start RL Data: We constructed our initial training data through the following process:
> 1. We crawled problems from Art of Problem Solving (AoPS) contests, prioritizing math olympiads, team selection tests, and post-2010 problems explicitly requiring proofs, totaling 17,503 problems.
|
They reference https://artofproblemsolving.com/community/c13_contest_collec... as the source of their scrape, and the Putnam problems are on that page under 'Undergraduate Contests'.