Hacker News
HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top)
(hwebench.com)
6 points by fesens 40 days ago | 3 comments
fesens
40 days ago
Current benchmarks have ceilings, usually 100%. This benchmark aims to be long-lasting, to correlate highly with the ability to solve real-world problems and follow complex instructions, and to be unbounded (meaning scores can always go higher).
paulobeckhauser
39 days ago
Very nice!!
fabiofachini92
40 days ago
Amazing!