Hacker News
by zone411 1236 days ago
The author said that he comes up with new questions every year.
1 comment

Still, I would assume those questions are semantically similar to the ones you can find in exam prep material all over the internet. My point is that exams are a crutch we use to gauge how well a person has studied a subject, and we use it because we seem to lack better measuring devices. It's entirely possible to ace an exam while being horrible at actually applying or working in a subject. I'd therefore argue that measuring how well LLMs perform on exams designed for humans is simply a more complicated Turing test, with all its shortcomings.