by tedsanders
3338 days ago
Fair, but that's mitigated if you have a rule that requires an ordering of the data points (say, chronologically). Then there should be no difference between two 500-data-point studies and one 1,000-data-point study partitioned in two (uniquely determined) halves.
After you, the good guy, have specified which half of the data is the playground and which is the confirmatory test set, Evil Scientist can still run as many hypotheses as he feels like until he finds one that validates in both halves.
Under the rule "you can only validate a hypothesis by collecting a new data set dedicated to that hypothesis", we, the observers, have a way of guaranteeing that multiple comparisons did not occur. We have no such guarantee under the system you describe.
So to sum up: the rule I describe is not necessary in order to practice good statistics for your own benefit. But it is necessary in order to have a good statistical argument for convincing someone who can't directly perceive the contents of your mind. It's an auditing tool.
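To make the multiple-comparisons point concrete, here is a rough simulation sketch, not anyone's actual analysis. It assumes each "hypothesis" is just a column of pure noise drawn from a unit-variance normal, splits it into two ordered halves, and counts how many noise hypotheses come out significant at the two-sided 5% level in both halves with the same sign. The function names, the sample sizes, and the known-variance z-test are all illustrative assumptions of mine.

```python
import math
import random

Z_CRIT = 1.96  # two-sided 5% critical value for a standard normal


def z_stat(sample):
    """z statistic for H0: mean = 0, assuming known unit variance (an assumption)."""
    return sum(sample) / math.sqrt(len(sample))


def validates_in_both_halves(sample, z_crit=Z_CRIT):
    """True if both ordered halves reject H0, and in the same direction."""
    mid = len(sample) // 2
    z1, z2 = z_stat(sample[:mid]), z_stat(sample[mid:])
    return abs(z1) > z_crit and abs(z2) > z_crit and z1 * z2 > 0


def count_false_validations(n_hypotheses, n_points, rng):
    """Try many pure-noise hypotheses; count those that validate in both halves."""
    hits = 0
    for _ in range(n_hypotheses):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        if validates_in_both_halves(noise):
            hits += 1
    return hits


if __name__ == "__main__":
    rng = random.Random(0)
    hits = count_false_validations(n_hypotheses=2000, n_points=200, rng=rng)
    print(f"{hits} of 2000 null hypotheses validated in both halves")
```

Under these assumptions each noise hypothesis passes with probability about 2 × 0.025² ≈ 0.00125, so trying 2,000 of them should produce a couple of "validated" findings on average. An observer who sees only the surviving hypothesis cannot tell it apart from one that was specified in advance.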