Or why the LLM doesn’t do a lookup into a subset of the training data, used as a database, and reject the output if it seems to be wrong. A billion of the most popular URLs and the entirety of Wikipedia, arXiv and Stack Overflow would go a long way.
Because if the LLM could tell right from wrong, it wouldn't have to do this in the first place. It's like the Bible claiming it's true because the Bible says it's true. Circular logic.