|
|
|
|
|
by thinkingtoilet
132 days ago
|
|
We know Open AI got caught getting benchmark data and tuning their models to it already. So the answer is a hard no. I imagine over time it gives a general view of the landscape and improvements, but take it with a large grain of salt. |
|
We had access to the eval data (since we funded it), but we didn't train on the data or otherwise cheat. We didn't even look at the eval results until after the model had been trained and selected.