> * ChatGPT's "inability to separate data from code" means every input, even training input, is an eval().

This is very true in GPT-3, less true in GPT-3.5, and even less true in GPT-4. OpenAI is moving to separate system prompts from user prompts. The system prompt is processed first and attempts to isolate the user prompt from the system prompt. It's fallible, but getting better.

> * LLMs have to be assumed to be entirely jailbroken and untrusted at all times. You can't run one behind your firewall.

This only makes sense if you also won't put humans behind your firewall. LLMs can only do things they are empowered to do, much like humans. The fact that there are scammers who send fake invoices to businesses or call with fake wire transfer instructions does NOT mean that we disallow humans from paying invoices or transferring money. We just put systems (training and technical) in place to validate human actions. Same with LLMs.

> * The fate of millions of businesses, possibly humanity, rests on an organization that thinks they can secure an eval() statement with a blocklist.

Counterpoint: the fate of humanity is also being influenced by people who see the real similarities but don't understand the real differences between LLM inputs and eval().
The point isn't that you can't use LLM output; it's that you should always treat LLM output as potentially hostile. You can somewhat mitigate this by pairing an LLM with a deterministic system that only allows a predictable subset of behavior, but the problem is tricky to remove completely.
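
To make the "deterministic system" idea concrete, here is a minimal sketch of one way it could look: the LLM is only allowed to propose an action as JSON, and plain code checks that proposal against an allowlist before anything runs. The action names, the `Action` type, the length limit, and `parse_llm_action` are all hypothetical and chosen purely for illustration; this is not any vendor's API.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical allowlist and limit, for illustration only.
ALLOWED_ACTIONS = {"lookup_invoice", "draft_reply"}
MAX_ARG_LENGTH = 200


@dataclass(frozen=True)
class Action:
    name: str
    argument: str


def parse_llm_action(raw_output: str) -> Optional[Action]:
    """Return a validated Action, or None if the output fails any check."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # not JSON at all: reject rather than guess
    # Require exactly the expected keys; extra fields are rejected.
    if not isinstance(data, dict) or set(data) != {"name", "argument"}:
        return None
    name, argument = data["name"], data["argument"]
    if not isinstance(name, str) or not isinstance(argument, str):
        return None
    # Default deny: only actions on the allowlist get through.
    if name not in ALLOWED_ACTIONS or len(argument) > MAX_ARG_LENGTH:
        return None
    return Action(name, argument)


if __name__ == "__main__":
    print(parse_llm_action('{"name": "lookup_invoice", "argument": "INV-1"}'))
    print(parse_llm_action('{"name": "wire_transfer", "argument": "all of it"}'))
```

Note this is default-deny (an allowlist), not a blocklist, and it only narrows what the model can trigger. The `argument` string is still untrusted text, so whatever consumes it has to handle it as data, which is why the problem doesn't go away completely.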