I think the biggest thing is to not give it access to anything like a shell (obviously), limit the call length, and give it a hangup command.
Then you tell it to just not answer off-the-wall questions etc., and if you are using a good model it will resist casual attempts.
I don't see being able to ask nonsense questions as being a big deal for an average small business. But you could put a guardrail model in front to make it a lot harder if it was worth it.
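To make the first three points concrete, here is a rough sketch of a tool allowlist with a call time limit and a hangup command. Everything in it is an assumption for illustration: the `CallSession` class, the tool names `hangup` and `lookup_hours`, and the 300 second limit are all hypothetical, not taken from any particular voice-agent framework.

```python
import time

# Hypothetical tool names; note there is no shell or code-execution tool at all.
ALLOWED_TOOLS = {"hangup", "lookup_hours"}
MAX_CALL_SECONDS = 300  # assumed limit, tune for your business


class CallSession:
    """Tracks one call and decides whether a requested tool may run."""

    def __init__(self, max_seconds=MAX_CALL_SECONDS, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + max_seconds
        self.active = True

    def expired(self):
        return self._clock() >= self._deadline

    def dispatch(self, tool_name):
        # Once the call is over, nothing else runs.
        if not self.active:
            return "call_ended"
        # End the call when time runs out or the model asks to hang up.
        if self.expired() or tool_name == "hangup":
            self.active = False
            return "call_ended"
        # Anything outside the allowlist is refused, whatever the model asks for.
        if tool_name not in ALLOWED_TOOLS:
            return "refused"
        return "ok"
```

The point of doing it this way is that the limits live in ordinary code outside the model, so a clever prompt can't talk its way past them.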
In general, these types of attacks are still difficult to solve because there are a lot of different ways they can be formulated. LLM-based security is still an unknown, but mostly I have seen people using intermediary steps to parse question intent and return canned responses if the question seems outside the intended scope.
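A minimal sketch of that intermediary step might look like the following. The keyword matcher is only a crude stand-in for a real intent classifier or guardrail model, and the intent names, keywords, canned reply, and the `model_reply` callable are all made up for the example.

```python
# Hypothetical canned response for anything outside the intended scope.
CANNED_REPLY = "Sorry, I can only help with questions about our hours, location, and bookings."

ALLOWED_INTENTS = {"hours", "location", "booking"}

# Assumed keyword lists; a real setup would likely use a classifier model here.
KEYWORDS = {
    "hours": ("open", "close", "hours"),
    "location": ("address", "located", "directions"),
    "booking": ("book", "appointment", "reservation"),
}


def classify_intent(question):
    """Return the first intent whose keyword appears in the question, else 'other'."""
    text = question.lower()
    for intent, words in KEYWORDS.items():
        if any(word in text for word in words):
            return intent
    return "other"


def answer(question, model_reply):
    """Only pass in-scope questions to the main model; everything else gets the canned reply."""
    if classify_intent(question) not in ALLOWED_INTENTS:
        return CANNED_REPLY
    return model_reply(question)
```

Substring matching like this is easy to fool (it would match "book" inside "facebook", for instance), so treat it as a placeholder for whatever classifier you actually trust. The structure is the part that matters: out-of-scope input never reaches the main model.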