Hacker News
by revel, 408 days ago
They used RFT, and there are only so many benchmarks out there, so I would be very surprised if they didn't train on the tests.