Hacker News
by randomsearch 2002 days ago
Worth noting that you can’t test a hypothesis on the same data you used to formulate it.

I haven’t seen any work on this variant that doesn’t fall into that trap.

1 comment

I would really, really hope that they are using a good cross-validation strategy (as they'll definitely need it).
Cross-validation won't save you in this situation. It's a subtle point, but you can't say "omg, this strain is increasing, so it must be more contagious" and then show that it is more contagious by showing that it is increasing. This is counter-intuitive even to most scientists, in my experience. I remember when I learnt this during my PhD. I suspect many people high up in academia don't understand this point.
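For anyone who wants to see the circularity in numbers, here is a rough simulation sketch. Everything in it is an assumption made for illustration and none of it comes from real surveillance data: 50 hypothetical strains, 10 weeks of log-growth values drawn as independent standard normals (so no strain is truly more contagious), and a one-sided z-test with known variance. The function name `false_positive_rates` is made up.

```python
import random
import statistics


def false_positive_rates(n_trials=1000, n_strains=50, n_weeks=10, seed=0):
    """Compare testing on the data that suggested the hypothesis vs. fresh data.

    Every strain is neutral by construction: weekly log-growth ~ N(0, 1).
    """
    rng = random.Random(seed)
    z_crit = 1.645  # one-sided 5% threshold for a standard normal
    same_hits = 0
    fresh_hits = 0
    for _ in range(n_trials):
        growth = [
            [rng.gauss(0.0, 1.0) for _ in range(n_weeks)]
            for _ in range(n_strains)
        ]
        means = [statistics.fmean(g) for g in growth]
        # Exploratory step: notice whichever strain happened to grow the most.
        best = max(range(n_strains), key=means.__getitem__)
        # "Confirm" the hypothesis on the same data that suggested it.
        z_same = means[best] * n_weeks ** 0.5
        # Test the same strain on newly drawn, independent data.
        fresh = [rng.gauss(0.0, 1.0) for _ in range(n_weeks)]
        z_fresh = statistics.fmean(fresh) * n_weeks ** 0.5
        same_hits += z_same > z_crit
        fresh_hits += z_fresh > z_crit
    return same_hits / n_trials, fresh_hits / n_trials


if __name__ == "__main__":
    same, fresh = false_positive_rates()
    print(f"'significant' on the same data: {same:.2f}")
    print(f"'significant' on fresh data:    {fresh:.2f}")
```

Under these assumptions the same-data test should flag the selected strain in the large majority of trials even though nothing is truly fitter, while the fresh-data test should stay near the nominal 5%. The catch for real epidemics is that fresh, independent data is exactly what is hard to get.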
An observation that may help: because your stats at time t+1 depend on your stats at time t, you cannot separate your validation set from the set you used for exploratory data analysis; they are highly interdependent.
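A small sketch of that interdependence, again under made-up assumptions: the series is a pure random walk (so there is nothing real to learn), and the "model" is a hypothetical nearest-in-time lookup that predicts each held-out point from the closest training point. It is a toy rather than anyone's actual analysis, but it shows why a shuffled hold-out set from a time series is not independent of the rest.

```python
import random


def split_errors(n=200, n_series=20, seed=1):
    """Mean absolute error of a nearest-in-time predictor under two splits."""
    rng = random.Random(seed)

    def nearest_error(series, train_idx, valid_idx):
        # Predict each validation point with the training point closest in time.
        total = 0.0
        for t in valid_idx:
            nearest = min(train_idx, key=lambda s: abs(s - t))
            total += abs(series[t] - series[nearest])
        return total / len(valid_idx)

    random_total = 0.0
    forward_total = 0.0
    half = n // 2
    for _ in range(n_series):
        # Random walk: the value at t+1 is the value at t plus noise.
        series = [0.0]
        for _ in range(n - 1):
            series.append(series[-1] + rng.gauss(0.0, 1.0))

        # Shuffled split: validation points sit right next to training points.
        idx = list(range(n))
        rng.shuffle(idx)
        random_total += nearest_error(series, idx[:half], idx[half:])

        # Forward split: train on the past, validate on the future.
        forward_total += nearest_error(
            series, list(range(half)), list(range(half, n))
        )
    return random_total / n_series, forward_total / n_series


if __name__ == "__main__":
    shuffled, forward = split_errors()
    print(f"shuffled split error: {shuffled:.2f}")
    print(f"forward split error:  {forward:.2f}")
```

The shuffled split should look much better than the forward split, purely because each held-out point has a near-identical neighbour in the training set. A forward split removes that leak, but it still doesn't fix the original problem if the hypothesis itself came from looking at the whole series.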