by hmokiguess 118 days ago
You should publish your evaluation set, that seems pretty interesting!

What’s your favourite one?

2 comments

Why would you ask that? The whole point of making it private is to avoid it leaking into the training data.
I thought open benchmarks helped, sorry, guess I was being naive.
Ha, sorry, I was a bit brusque there.

Open benchmarks do help, but they mostly help the vendors, not us, the users!

Keeping tests private is the only way to keep them valid.