HN Mirror



	by madchops1 3191 days ago
	I am doing my training/evaluation with a data split of 70/30. Doesn't that qualify as a proper backtest?


soVeryTired 3191 days ago

I don't really know what you mean by evaluation. But you need to be able to (faithfully) generate all the positions your system would take through time, and also to generate all the returns you would have made through time.

Aside from pure P&L, you should be looking at how much risk your system is taking, and under what conditions it's doing badly. All backtests are overfit: their use is mostly in identifying problems with your strategy, rather than predicting how much money you'll make.
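As a rough illustration of the two points above (generate every position and every return through time, then look at risk rather than just total P&L), here is a minimal sketch. The price series and the `momentum_signal` rule are made up for the example and are not the system discussed in this thread; the only real point is that the position at step t is decided from data up to t and earns the price change from t to t+1.

```python
def backtest(prices, signal):
    """Walk forward through prices, one step at a time.

    The position held over (t, t+1) is chosen using only prices[0..t],
    so no future information leaks into the decision.
    """
    positions, pnl = [], []
    for t in range(len(prices) - 1):
        pos = signal(prices[: t + 1])          # -1 (short), 0 (flat), +1 (long)
        positions.append(pos)
        pnl.append(pos * (prices[t + 1] - prices[t]))
    return positions, pnl


def max_drawdown(pnl):
    """Largest peak-to-trough fall in cumulative P&L (a simple risk measure)."""
    cum = peak = worst = 0.0
    for x in pnl:
        cum += x
        peak = max(peak, cum)
        worst = max(worst, peak - cum)
    return worst


def momentum_signal(history):
    # Hypothetical toy rule: follow the direction of the last price move.
    if len(history) < 2:
        return 0
    if history[-1] > history[-2]:
        return 1
    if history[-1] < history[-2]:
        return -1
    return 0


# Made-up prices, purely for illustration.
prices = [100.0, 101.0, 102.0, 101.0, 99.0, 100.0, 102.0]
positions, pnl = backtest(prices, momentum_signal)
print("positions:   ", positions)
print("per-step P&L:", pnl)
print("total P&L:   ", sum(pnl))
print("max drawdown:", max_drawdown(pnl))
```

The per-step P&L list is the useful output: it shows when the strategy loses money, which a single accuracy or error number does not.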

One question you'd get asked if you were proposing this in a real trading environment is this: what is it about the QM emini contract that makes this work? Does it work for other energy contracts? For other commodities? For bonds, or equities? If not, why not?


madchops1 3191 days ago

Basically I have a dataset and I train my model with 70% and then evaluate its guesses against the remaining 30%. Hence a baseline is created and I can see if my model performs better.
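The thread does not say whether the 70/30 split is random or chronological. For time-series data the distinction matters, because a random split lets the model train on observations that come after the ones it is evaluated on. A minimal sketch of a chronological split, assuming the rows are already sorted oldest to newest (the function name and the `train_pct` parameter are illustrative):

```python
def chronological_split(rows, train_pct=70):
    """Split time-ordered rows so all training rows precede all evaluation rows.

    Assumes `rows` is sorted from oldest to newest. Integer arithmetic is
    used for the cut point to avoid floating-point rounding surprises.
    """
    cut = len(rows) * train_pct // 100
    return rows[:cut], rows[cut:]


# Stand-in for 10 time-ordered observations (index = time step).
rows = list(range(10))
train, evaluation = chronological_split(rows)
print("train:     ", train)
print("evaluation:", evaluation)
```

Even with a chronological split, a single held-out block only gives an error metric; it does not by itself produce the positions and returns through time that a backtest needs.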

It took some doing to get this model to perform well. I did this by adding features that help recognize patterns in the time series data.

The features I created are not specific to QM, as they are technical (e.g., numbers, not news) and time-series related. So the models should work with any historical dataset with the same fields.

My goal is to add another futures contract at some point.


soVeryTired 3191 days ago

I don't understand your baseline.

I feel like you're talking past me a little. The first thing you need to do is generate all the positions your system would have taken over as many years as possible, and figure out at what times you make and lose money. Otherwise you don't have a backtest.


madchops1 3188 days ago

I apologize. I can do that. I'm going to generate that backtest you described.

Right now I have residual data from the AWS machine learning output that tells me whether there is any structure to the times it guesses wrong. And a value below baseline is a better-than-50/50 guess, according to what I have learned about how AWS does its ML. Knowing that, I use this personally as a supporting indicator for my trade decisions. Since it's so new, and I really don't want people to think I'm scamming or something, I'm just releasing my results free for now, not trying to be a douche ;)

AWS defines the baseline as follows:

Baseline RMSE: Amazon ML provides a baseline metric for regression models. It is the RMSE for a hypothetical regression model that would always predict the mean of the target as the answer. For example, if you were predicting the age of a house buyer and the mean age for the observations in your training data was 35, the baseline model would always predict the answer as 35. You would compare your ML model against this baseline to validate if your ML model is better than a ML model that predicts this constant answer.
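The baseline described in that definition can be reproduced by hand. The sketch below is not Amazon ML code; it just follows the quoted description (always predict the training-set mean) with made-up numbers modeled on the house-buyer example, and the function names are illustrative.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def baseline_rmse(train_targets, eval_targets):
    """RMSE of a model that always predicts the mean of the training targets."""
    mean = sum(train_targets) / len(train_targets)
    return rmse(eval_targets, [mean] * len(eval_targets))


# Made-up data: training ages average 35, as in the quoted example.
train_ages = [30.0, 35.0, 40.0]
eval_ages = [33.0, 37.0]
model_predictions = [34.0, 36.0]   # hypothetical model output

base = baseline_rmse(train_ages, eval_ages)
model = rmse(eval_ages, model_predictions)
print("baseline RMSE:", base)
print("model RMSE:   ", model)
print("model beats baseline:", model < base)
```

Beating this baseline only says the model's error is smaller than that of a constant guess at the mean. It says nothing directly about direction of price moves or about profitability.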
