Being difficult to define rigorously does not preclude its existence or its usefulness as a model. The paper addresses how, in its view, humans differ from LLMs with respect to meaning.
Namely: reliably conveying the same mental model to another entity regardless of syntactic differences, or else failing to do so in a way that a bell curve doesn't predict. The paper argues that humans modelling the mental state of their conversation partner is part of how meaning is reliably exchanged, and that LLMs are unable to do this because it is completely absent from their training data.