Hacker News
by nenaoki 483 days ago
An LLM would have to be two-faced in a sense to surreptitiously mess with code alone and be normal otherwise.

It's an interesting, headline-worthy result, but I think the research they were doing when they accidentally found that line of questioning is slightly more interesting: does an LLM trained on a behavior have self-awareness of that behavior? https://x.com/betleyjan/status/1894481241136607412


(NB "Two-faced LLMs" are apparently trivial: https://news.ycombinator.com/item?id=43121383)