by PoignardAzur 1190 days ago
Wait, the ARC team didn't do their tests in a closed network? And they had it interact with actual people?

That's... well, it's probably fine given what they knew about the model's capabilities, but it's a pretty crappy precedent to set for a "protocol for testing whether our cutting-edge AI can do large-scale damage".

1 comment

I don't think we should assume they know the model's capabilities. They seem surprised by each iteration too.