|
|
|
|
|
by diggan
513 days ago
|
|
> Tamay from Epoch AI here. We made a mistake in not being more transparent about OpenAI's involvement. We were restricted from disclosing the partnership until around the time o3 launched, and in hindsight we should have negotiated harder for the ability to be transparent to the benchmark contributors as soon as possible. Our contract specifically prevented us from disclosing information about the funding source and the fact that OpenAI has data access to much but not all of the dataset. Not sure if "integrity of the benchmarks" should even be something that you negotiate over, what's the value of the benchmark if the results cannot be trusted because of undisclosed relationships and sharing of data? Why would they be restricted from disclosing stuff you normally disclose, and how doesn't that raise all sorts of warning flags when proposed even? |
|
Their head mathematician says they have the full dataset, except a holdout set which they're currently developing (i.e. doesn't exist yet):
https://www.reddit.com/r/singularity/comments/1i4n0r5/commen...