That is beside the point. My point is that detecting hallucinations seems like a very hard problem.
The utility is there, and it has nothing to do with making up episodes instead of quoting the current one. For example, you can ask it to write new episodes with specific settings and specific constraints. Hallucination is not the value add. Nobody is excited because it hallucinates. People are excited despite it, because the rest of the value it adds is so large.
But deliberately requesting and receiving content generation is altogether different from requesting a factual answer and receiving plausible-seeming nonsense. Or at least, it's different to the person asking; it's the same thing as far as the model is concerned.