Hacker News
Frontier Models are Capable of In-context Scheming (arxiv.org)
10 points by trott 546 days ago
1 comment

https://arxiv.org/abs/2412.04984

> Our findings demonstrate that frontier models now possess capabilities for basic in-context scheming [covertly pursuing misaligned goals], making the potential of AI agents to engage in scheming behavior a concrete rather than theoretical concern.