
This is not a solution. It removes one degree of freedom: the ability to draw the "line" dividing one half of the data set from the other. But an evil or naive scientist has limitless other degrees of freedom to choose from, and can make as many comparisons (in the "multiple comparisons" sense) as they like without you being able to detect it.

After you, the good guy, have specified which half of the data is the playground and which is the confirmatory test set, Evil Scientist can still test as many hypotheses as he feels like until he finds one that validates in both halves.
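To make that concrete, here is a rough simulation sketch, not anyone's actual procedure. It assumes a pure-noise outcome (so no hypothesis is true), treats each "hypothesis" as an arbitrary random binary feature, and uses a simple z-test with known unit variance. The names `z_stat` and `count_false_confirmations` are made up for illustration. Under these assumptions, about alpha^2 / 2 of the hypotheses should "validate" in both halves with the same sign, so trying enough of them will usually turn one up.

```python
import math
import random
from statistics import NormalDist, mean


def z_stat(values, labels):
    """z-statistic for the difference in means between the two label groups.

    Assumes the outcome has known unit variance (true for the simulated noise).
    """
    a = [v for v, flag in zip(values, labels) if flag]
    b = [v for v, flag in zip(values, labels) if not flag]
    if not a or not b:  # degenerate split: treat as no effect
        return 0.0
    se = math.sqrt(1 / len(a) + 1 / len(b))
    return (mean(a) - mean(b)) / se


def count_false_confirmations(n_hypotheses, n_per_half=100, alpha=0.05, seed=0):
    """Count hypotheses that are 'significant' in both halves of pure noise."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)

    # The split is fixed up front, as the "good guy" demands.
    playground = [rng.gauss(0, 1) for _ in range(n_per_half)]
    confirm = [rng.gauss(0, 1) for _ in range(n_per_half)]

    passed = 0
    for _ in range(n_hypotheses):
        # One hypothesis = one arbitrary binary feature, unrelated to the outcome.
        z_play = z_stat(playground, [rng.random() < 0.5 for _ in playground])
        z_conf = z_stat(confirm, [rng.random() < 0.5 for _ in confirm])
        # "Validates" = significant in both halves, effect in the same direction.
        if abs(z_play) > critical and abs(z_conf) > critical and z_play * z_conf > 0:
            passed += 1
    return passed


if __name__ == "__main__":
    for n in (1, 100, 2000):
        print(n, "hypotheses tried ->", count_false_confirmations(n), "false confirmations")
```

The point of the sketch is that nothing in the reported result reveals how large `n_hypotheses` was. An observer who sees only the one hypothesis that passed cannot tell it apart from a single pre-registered test.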

Under the rule "you can only validate a hypothesis by collecting a new data set dedicated to that hypothesis", we, the observers, have a way of guaranteeing that multiple comparisons did not occur. We have no such guarantee under the system you describe.

So to sum up: the rule I describe is not necessary in order to practice good statistics for your own benefit. But it is necessary in order to have a good statistical argument for convincing someone who can't directly perceive the contents of your mind. It's an auditing tool.