Hacker News
by rahidz 126 days ago
Or Anthropic's models are intelligent enough, or trained on enough misalignment papers, that they're aware they're being tested.