How would it know what our intention is? There are plenty of quirky examples in the training set, and it could be imitating any of them, especially at a high sampling temperature (T). It can't read minds, so we have to tell it: what we need to do is ask the model to review its answer against our criteria, stated explicitly.
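
To make that concrete, here is a rough sketch of what "review against our criteria" could look like in Python. Everything in it is an assumption for illustration: `generate` is a hypothetical stand-in for whatever function sends a prompt to your model and returns its text, and `build_review_prompt` / `answer_with_review` are names I made up, not part of any library. The prompt wording is just one plausible phrasing.

```python
from typing import Callable, Sequence


def build_review_prompt(question: str, answer: str, criteria: Sequence[str]) -> str:
    """Build a prompt that spells out our criteria instead of hoping the model guesses them."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(criteria, start=1))
    return (
        f"Question:\n{question}\n\n"
        f"Your previous answer:\n{answer}\n\n"
        "Review the answer against each criterion below. "
        "For each one, say whether it is met, then give a revised answer.\n"
        f"Criteria:\n{numbered}\n"
    )


def answer_with_review(
    question: str,
    criteria: Sequence[str],
    generate: Callable[[str], str],  # hypothetical: prompt in, model text out
) -> str:
    """Two passes: get a draft, then ask the model to review it against the criteria."""
    draft = generate(question)
    return generate(build_review_prompt(question, draft, criteria))


if __name__ == "__main__":
    # Stub model so the sketch runs without any API; swap in a real call.
    def echo_model(prompt: str) -> str:
        return f"[model output for a prompt of {len(prompt)} characters]"

    print(answer_with_review(
        "Summarize the report.",
        ["At most three sentences", "No speculation beyond the report"],
        echo_model,
    ))
```

The point of the design is only that the criteria travel inside the prompt. Whether the second pass actually improves the answer depends on the model and the criteria, so treat it as something to test rather than a guarantee.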