by ianand
795 days ago
They did invent their own functions to test whether the results were due to these functions being in the training data. See the section on data contamination in the paper. Agreed that it both kind of makes sense (regression is the best way to predict the next token in this context) and is kind of ironic (LLMs can do high school regression but can't do elementary school multi-digit arithmetic).