by timbilt
310 days ago
> Unlike many public benchmarks, the PR Benchmark is private, and its data is not publicly released. This ensures models haven’t seen it during training, making results fairer and more indicative of real-world generalization.

This is key. Public benchmarks are essentially trust-based and the trust just isn't there.