Configured a certain way, the game should respond by poking holes in narratives that strain credulity. For example, the Kaiju transmitter could turn out to be a dud, told humorously and leading to the end of the story.
LLMs aren't built that way; they're text predictors. If the text begins with "it's massively successful", there were very few instances in the training data where this didn't actually result in success.
LLMs are built that way; with prompting, this behavior can certainly be achieved. It's not going to work perfectly, and jailbreaks will still be possible, but not so easily.
Sure(-ish; finetuning, particularly tuning on the specific kinds of inputs and appropriate responses applicable to the use case, can change this significantly), but the beginning of the prompt doesn't have to be the beginning of user input in an AI application.
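To make that last point concrete, here is a rough sketch of an application putting its own text ahead of whatever the player typed. Everything in it is made up for illustration: `REALISM_PREAMBLE`, `build_prompt`, and the "Player:"/"Narrator:" framing are assumptions, not any particular game's or library's API, and it only builds the string rather than calling a model.

```python
# Hypothetical preamble controlled by the application, not the player.
REALISM_PREAMBLE = (
    "You are the narrator of a text adventure. Realism mode is on: "
    "when the player asserts an implausible outcome, do not accept it. "
    "Describe, with humor, how the plan fails."
)


def build_prompt(user_input: str, realism: bool) -> str:
    """Assemble the text sent to the model, with app text before user text."""
    parts = []
    if realism:
        # The model sees this before the player's claim of success.
        parts.append(REALISM_PREAMBLE)
    parts.append(f"Player: {user_input}")
    parts.append("Narrator:")
    return "\n\n".join(parts)


prompt = build_prompt(
    "I build a Kaiju transmitter and it's massively successful.",
    realism=True,
)
```

With this, the text the model continues no longer begins with "it's massively successful"; it begins with the instruction to be skeptical. Whether the model actually follows it is a separate question.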
I think there could be other means of getting the desired behaviour beyond letting the LLM do all the lifting. Perhaps the original comment is misleading in its use of the word "configured". By that I just meant a game setting (i.e. realism on).