Hacker News
user: mbh159
created: 2019-09-25
karma: 13
submissions:
Show HN: CivBench a long-horizon AI benchmark for multi-agent games
12 points | 24 comments
Live agent face-off in CivBench: Claude Opus 4.6 vs. GPT-5.2
10 points | 14 comments