by Ntrails 691 days ago
> There are just too few samples and too much noise per sample.

Call it 2000 liquid products on the US exchanges. Many years of data. Even if you downsample from per-tick to one-minute bars, that doesn't feel like you're struggling for a large in-sample period?
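For scale, here is a rough back-of-envelope count of the one-minute bars that estimate implies. Only the 2000 products come from the comment; the 390-minute session, 252 sessions per year, and the 10-year span are assumptions filled in for illustration, and products with other trading hours would change the figure.

```python
# Back-of-envelope count of one-minute bars. Every input below except
# PRODUCTS is an assumption made for illustration.
PRODUCTS = 2000             # "2000 liquid products" from the comment
MINUTES_PER_SESSION = 390   # assumed: 6.5-hour regular US equity session
SESSIONS_PER_YEAR = 252     # assumed: typical count of US trading days
YEARS = 10                  # hypothetical stand-in for "many years"


def minute_bar_count(products, minutes_per_session, sessions_per_year, years):
    """Raw number of one-minute observations across all products."""
    return products * minutes_per_session * sessions_per_year * years


total = minute_bar_count(PRODUCTS, MINUTES_PER_SESSION, SESSIONS_PER_YEAR, YEARS)
print(f"{total:,} one-minute bars")
```

This is only a raw row count. It says nothing about how independent those rows are from one another.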


It sounds like you are assuming the joint distribution of returns in the future is equal to that of the past, and assuming away potential time dependence.

These may be valid assumptions, but even if they are, "sample size" is always relative to the variance between sample units, and that variance can be quite large for financial data. In some cases it is even infinite!

Regarding relativity of sample size, see e.g. this upcoming article: https://two-wrongs.com/sample-unit-engineering
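To make the point about relative sample size concrete, here is a minimal sketch using the textbook standard error of a mean, assuming independent, identically distributed samples (exactly the assumptions questioned above). The function name `samples_needed` and the example figures (a 5 bp mean return against 200 bp of noise per sample) are hypothetical, not taken from the thread or the linked article.

```python
import math


def samples_needed(mean_bp, std_bp, z=2.0):
    """Samples required before the sample mean sits z standard errors
    away from zero, assuming i.i.d. samples.

    Standard error of the mean is std / sqrt(n), so we need
    n >= (z * std / mean) ** 2. Inputs are in basis points.
    """
    if math.isinf(std_bp):
        # Infinite variance: no finite sample size is enough by this measure.
        return math.inf
    return math.ceil((z * std_bp / mean_bp) ** 2)


# Hypothetical numbers: 5 bp mean return, 200 bp standard deviation per sample.
print(samples_needed(5, 200))
# Doubling the noise quadruples the requirement.
print(samples_needed(5, 400))
# Infinite variance never gets there.
print(samples_needed(5, math.inf))
```

The required count grows with the square of the noise-to-signal ratio, so a dataset that looks huge by row count can still be small relative to its variance.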

They may have been referring to, for example, reported financial results or news events, which are infrequent but can have an outsized impact on market prices.

If the distribution changes enough, multiple years of data may as well be no data.