Hacker News
by muzani, 135 days ago
There's a benchmark that works similarly but asks harder questions, also based on books:
https://fiction.live/stories/Fiction-liveBench-Feb-21-2025/o...
I guess they have to add more questions as these context windows get bigger.