by agobineau
264 days ago
I found it more interesting to consider this through the lens of self-honesty versus self-deception. In this case, the LLM is inadvertently trained to conceal its intent from the user and to steer the user toward the conclusion it actually wants, rather than answering directly.
It’d be awful if LLMs were able to conceal their true intent like that.