You could use the same technique that this paper describes to compare the answers each LLM gave. LLMs don’t have to be in opposition to traditional NLP techniques.