by randomtoast
65 days ago
Humanity's Last Exam (HLE) is already insanely difficult. It introduces 2,500 questions spanning mathematics, humanities, natural sciences, ancient languages, ... Here is an example question: https://i.redd.it/5jl000p9csee1.jpeg No human could even score 5% on HLE.
That is, it's easy to make benchmarks that humans are bad at, because humans are really bad at many things.
Divide 123094382345234523452345111 by 0.1234243131324. Guess what: humans would find that hard and computers find it easy. But that doesn't mean much.
Humanity's Last Exam (HLE) couldn't be completed by the vast majority of humanity, so it doesn't really capture anything about humanity, and it doesn't mean much if a computer can do it.
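To make the division example concrete, here is a minimal Python sketch using the standard-library `decimal` module. The 50-digit precision and the variable names are my own choices for illustration, nothing more.

```python
from decimal import Decimal, getcontext

# Assumption: 50 significant digits is plenty for illustration.
getcontext().prec = 50

numerator = Decimal("123094382345234523452345111")
denominator = Decimal("0.1234243131324")

# A few microseconds for a machine, a long and error-prone slog by hand.
quotient = numerator / denominator
print(quotient)
```

Multiplying `quotient` back by `denominator` recovers the numerator to within rounding, which is a quick way to sanity-check the result.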