Hacker News new | ask | show | jobs
by tel 1162 days ago
We're getting kind of to the point where these words are hard to define, but it's worth really interrogating these. Ultimately, the context and the prompt don't change the fundamental nature of how the system was optimized.

You can try very hard to construct a prompt such that the most plausible continuations of that prompt are consistent with a notion of identity, but you can also easily witness pulling the AI out of that prompt. Jailbreaks do that work today.

But even then, it's, largely, continuing the prompt as well as possible. Many "identities" can satisfy that aim. See the idea of "Waluigis" for example.