by falcor84
60 days ago
They focus on minimizing the number of moves and don't allow any harness whatsoever, putting the bar extremely high. The current top verified contender (Claude Opus 4.6) is at only 0.45%. But with how new it is, I expect a lot of improvement in the next generation of models.