Hacker News
by bionsystem 721 days ago
I have this idea that we only have one universe of historical financial data, and it is only 500 years long, which is ridiculously small. So backtesting and drawing conclusions from it are highly overrated.

Another thing, as you said, is that it's hard to get quality data. For example, most databases don't include price history for bankrupt companies (or miss quite a few of them), which makes some quantitative strategies, like focusing on low PE and PB, completely bogus. Which is sad, because most books will actually tell you to do that, without ever mentioning how many of those backtests lack companies with a -100% return in their virtual portfolios. Those tend to be low-PE companies that the market considered risky, and it was right, but because they disappeared from the data, the strategies appear to outperform: they ignore so many losers.
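To put rough numbers on that, here is a minimal sketch of the effect. The tickers, returns, and bankruptcy flags are all made up for illustration; the only point is that dropping the -100% names flips the sign of an equal-weighted low-PE basket.

```python
# Hypothetical toy universe of low-PE stocks: (ticker, total_return, went_bankrupt).
# Every number here is invented for illustration, not real market data.
universe = [
    ("AAA", 0.40, False),
    ("BBB", 0.25, False),
    ("CCC", 0.10, False),
    ("DDD", -1.00, True),   # delisted: missing from a survivor-only database
    ("EEE", -1.00, True),   # delisted: missing from a survivor-only database
]


def mean_return(stocks):
    """Equal-weighted average total return of the given stocks."""
    return sum(ret for _, ret, _ in stocks) / len(stocks)


# What a backtest sees when bankrupt companies are absent from the data.
survivors_only = [s for s in universe if not s[2]]
biased = mean_return(survivors_only)

# What a portfolio holding all five names would actually have earned.
unbiased = mean_return(universe)

print(f"survivors only: {biased:+.0%}")
print(f"full universe:  {unbiased:+.0%}")
```

With these made-up numbers the survivor-only average is +25%, while the full basket averages -25%.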

3 comments

To develop a trading EDGE, you're looking for a market inefficiency. The model does not depend on perfect data, or even accurate data. The model is even tested on random prices... by using a FILTER to see if it still holds.

Then, you're going to paper trade it. Then live trade it. Historical data can only give you a directional indicator... is your 'thesis' of market inefficiency... directionally accurate?
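The comment doesn't spell out the filter or the exact test, so the following is only one possible reading of "tested with random prices": run a rule on synthetic random-walk prices, where there is no inefficiency by construction, and check that it does not show a profit there. The momentum rule, the parameters, and the function names are all hypothetical stand-ins.

```python
import random


def random_walk(n, rng, start=100.0, vol=0.01):
    """Prices with i.i.d. zero-mean Gaussian returns: no edge by construction."""
    prices = [start]
    for _ in range(n):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, vol)))
    return prices


def momentum_rule_pnl(prices, lookback=5):
    """Hypothetical rule: be long for the next step whenever the price is
    above its level `lookback` steps ago, otherwise stay flat.
    Returns the sum of the one-step returns earned while long."""
    pnl = 0.0
    for t in range(lookback, len(prices) - 1):
        if prices[t] > prices[t - lookback]:
            pnl += prices[t + 1] / prices[t] - 1.0
    return pnl


rng = random.Random(42)  # fixed seed so the run is repeatable
pnls = [momentum_rule_pnl(random_walk(250, rng)) for _ in range(500)]
avg_pnl = sum(pnls) / len(pnls)

print(f"average P&L over {len(pnls)} random paths: {avg_pnl:+.4f}")
```

If a rule shows a clearly positive average on paths like these, the "edge" is probably an artifact of the backtest (look-ahead, selection, overfitting) rather than a market inefficiency.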

I don't think we have 500 years of data. Anything before 1926 is pretty sketchy.
Indeed, I pulled 500 out of my hat because that's roughly the length of stock market history. But that kind of reinforces my point: if we only have 100 years, that's very little, especially considering how fast things change. In the early 20th century, very few industries were publicly traded, even among those that existed at the time. And beyond the industries, other things have changed a lot, like regulation, taxes, accounting rules, management style... surely the market takes all of those into account, one way or another.
Yeah, the thing that makes trading and investing different from most other disciplines is that the distributions are completely non-stationary and are changing all the time. There are some "stylized facts" (that's the term to search for), so use those to at least ground your model, but you won't make any money from them.
Yeah, if you want good data, start collecting it now. Anyway, I believe the magic is outside the numbers. They are only a shadow on a cave's wall.
I agree, there is a qualitative judgment to be made. AI can help, especially with summarizing text like news and earnings calls, but there is quite a bit of human work to be done beyond just running some software.