by barisser 4256 days ago
Here's another way to think of it.

If the parameter space for my model includes, let's say, 10 binary decisions (which is very conservative), that's 2^10 = 1024 possible states of my model. If I tested all 1024 states against historical data, some of them would likely do very well (depending on the general architecture of the model, of course). What if I then selected the successful minority and held them up as clever strategies? Their success would very likely be down to chance. By brute-forcing enough strategies, I will inevitably come across some that were historically successful. But these same historically successful strategies are unlikely to outperform a random strategy in the future. It's not impossible that you'll find a nugget of wisdom hidden from everyone else, just much less likely than the simpler explanation I'm offering.

So to your point, it's not just the size of the parameter space versus the data set that matters. Brute-forcing the parameter space alone will likely produce a deceptive minority of winners.
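
For anyone who wants to see the 1024-state thought experiment in numbers, here is a rough sketch. Everything in it is an assumption made for illustration: the "market" is pure Gaussian noise, each of the 1024 states is a stand-in strategy that just goes long or short at random each day, and the names (simulate, n_days, top_k) are made up. It is not a real backtester, only a toy showing selection on noise.

```python
import random
from itertools import product


def simulate(n_bits=10, n_days=250, top_k=10, seed=0):
    """Backtest every parameter setting on pure noise, then re-test the winners."""
    rng = random.Random(seed)

    # Every combination of n_bits binary decisions: 2**10 = 1024 model states.
    configs = list(product((0, 1), repeat=n_bits))

    # ASSUMPTION: returns are pure noise, so no strategy can have a real edge.
    past = [rng.gauss(0, 1) for _ in range(n_days)]
    future = [rng.gauss(0, 1) for _ in range(n_days)]

    results = []
    for config in configs:
        # ASSUMPTION: each state is a hypothetical strategy that goes
        # long (+1) or short (-1) at random each day.
        past_pos = [rng.choice((-1, 1)) for _ in range(n_days)]
        future_pos = [rng.choice((-1, 1)) for _ in range(n_days)]
        in_sample = sum(p * r for p, r in zip(past_pos, past))
        out_of_sample = sum(p * r for p, r in zip(future_pos, future))
        results.append((in_sample, out_of_sample, config))

    # Keep only the states that looked best on the historical data.
    results.sort(key=lambda t: t[0], reverse=True)
    winners = results[:top_k]

    return {
        "n_strategies": len(results),
        "winners_in_sample": sum(w[0] for w in winners) / top_k,
        "winners_out_of_sample": sum(w[1] for w in winners) / top_k,
        "all_in_sample": sum(r[0] for r in results) / len(results),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, value)
```

Under these assumptions the top 10 states should show a large average profit on the historical data simply because they were picked for it, while their average on the fresh data should sit much closer to zero. Changing the seed changes which states "win", which is the point.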

1 comment

There is a fun chapter on this topic in Jordan Ellenberg's latest book, "How Not to Be Wrong". It's called the "Baltimore stockbroker fraud".