	by traK6Dcm 2245 days ago
	Data distribution shift. The market changes over time, so your current data does not come from the same distribution as your old data. That limits how much data you can use for training and testing, and you need to be very careful not to overfit. This is especially true for something like daily or hourly data: there isn't much of it to begin with, and you won't have much left if you only look at the last few weeks or months. Market data also has a low signal-to-noise ratio, so you need a good chunk of it to learn anything. As you go to shorter time scales you get more usable data, but then you also have to deal with other issues such as latency/jitter, market impact, complex order types, order book queues, etc. It becomes a different game.
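
	To make the "limited window" point concrete, here is a rough sketch of walk-forward splitting, one common way to train only on recent data and test on what comes right after it. This is not something the comment prescribes; the function name `walk_forward_splits` and its parameters are made up for illustration, and the window sizes are arbitrary.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges over time-ordered samples.

    Hypothetical helper: each training window covers only the most
    recent `train_size` samples, and each test window is the
    `test_size` samples immediately after it. Older data is dropped
    as the window slides forward, which is one way to cope with a
    distribution that drifts over time.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")

    start = 0
    # Stop once a full train + test window no longer fits.
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Slide forward by one test window so test sets never overlap.
        start += test_size


if __name__ == "__main__":
    # 10 time-ordered samples, train on 4, test on the next 2.
    for train, test in walk_forward_splits(10, 4, 2):
        print(list(train), "->", list(test))
```

	Note how quickly the data runs out: with a year of daily bars (roughly 250 samples) and a 60-day training window, each model only ever sees 60 points, which is where the overfitting worry comes from.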