| > Models don't do that though, only if you run them in a loop with tools they can call, so mostly don't do that. That's also a description of DNA and RNA: they're chemicals, not magic. And there are loads of people all too eager to put any and every AI they find into such an environment[0], then connect it to a robot body[1] or to the internet[2], just to see what happens. Others have an AI or algorithm design T-shirts[3] or trade stocks[4][5][6] for them, without stopping to think about how this might go wrong. [0] https://community.openai.com/t/chaosgpt-an-ai-that-seeks-to-... [1] https://www.microsoft.com/en-us/research/group/autonomous-sy... [2] https://platform.openai.com/docs/api-reference [3] https://www.theguardian.com/technology/2013/mar/02/amazon-wi... [4] https://intellectia.ai/blog/chatgpt-for-stock-trading [5] https://en.wikipedia.org/wiki/Algorithmic_trading [6] https://en.wikipedia.org/wiki/2007–2008_financial_crisis |
I don't think "AI safety" is the right abstraction, because it came from the idea that AI would start off as an imaginary agent living in a computer that we'd teach stuff to. What we actually have is a giant pretrained blob that (unreliably) emits text when you run other text through it.
Constrained decoding (e.g. forcing the output to conform to a JSON grammar) is an example of a real solution; past that, it's mostly the same as any other software security problem.
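
To make the constrained decoding point concrete, here's a rough toy sketch in Python. Everything in it is made up for illustration: `VOCAB`, `GRAMMAR`, `fake_model_scores` and `constrained_decode` are hypothetical names, the "model" is just random scores, and the grammar is a tiny state machine that only accepts `{"answer":true}` or `{"answer":false}`. The idea it shows is that at every step you only pick among tokens the grammar allows, so the output parses no matter what the model "wants" to say.

```python
import json
import random

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"answer"', ':', 'true', 'false', 'hello', 'sure!']

# Hypothetical grammar as a finite-state machine: state -> {allowed token: next state}.
# It accepts exactly {"answer":true} or {"answer":false}.
GRAMMAR = {
    0: {'{': 1},
    1: {'"answer"': 2},
    2: {':': 3},
    3: {'true': 4, 'false': 4},
    4: {'}': 5},
}
ACCEPT = 5


def fake_model_scores(prefix, rng):
    """Stand-in for a language model: an arbitrary score for every token."""
    return {tok: rng.random() for tok in VOCAB}


def constrained_decode(seed=0):
    """Greedy decoding restricted to tokens the grammar permits at each step."""
    rng = random.Random(seed)
    state, out = 0, []
    while state != ACCEPT:
        scores = fake_model_scores(out, rng)
        allowed = GRAMMAR[state]
        # The "mask": tokens outside `allowed` are never considered,
        # however high the model scored them.
        tok = max(allowed, key=lambda t: scores[t])
        out.append(tok)
        state = allowed[tok]
    return ''.join(out)


if __name__ == "__main__":
    text = constrained_decode(seed=1)
    print(text, json.loads(text))
```

Real implementations do the same thing over the model's logits and a full JSON (or other) grammar rather than a five-state toy, but the guarantee is the same kind: it constrains the shape of the output, not whether the content is right.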