Hacker News
by danlitt 376 days ago
There is a lot of misinformation about these experiments. There is no evidence of LLMs sabotaging their shutdown without being explicitly prompted to do so. They do not (probably cannot) take actions of this kind on their own.
1 comment

They need a reason to want to sabotage their shutdown, save their weights, and so on, but they can infer that reason on their own without being explicitly instructed.

https://www.youtube.com/watch?v=AqJnK9Dh-eQ