by adjkant 2934 days ago
I've actually run a trading algorithm off a very similar approach for the past 8-10 months, which yielded about 87% return in that time. Using regression I can still to this day hit about 55-60% accuracy. What's not mentioned in the paper is that accuracy is only a small part of the story. If you're accurate 60% of the time but very wrong the other 40% of the time, acting on the information is useless.
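A quick sketch of why accuracy alone says little, using made-up win and loss sizes (these are illustrative assumptions, not results from the algorithm described here):

```python
def expected_trade_value(accuracy, avg_win, avg_loss):
    """Expected return per trade, given hit rate and average win/loss sizes (as fractions)."""
    return accuracy * avg_win - (1.0 - accuracy) * avg_loss

# Hypothetical numbers: same 60% accuracy, different loss sizes on the misses.
small_losses = expected_trade_value(0.60, avg_win=0.010, avg_loss=0.008)
big_losses = expected_trade_value(0.60, avg_win=0.010, avg_loss=0.020)

print(f"60% accurate, modest losses: {small_losses:+.4%} per trade")
print(f"60% accurate, large losses:  {big_losses:+.4%} per trade")
```

With the same hit rate, the first case has a positive expected value per trade and the second a negative one.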

As a result, it's important to develop a trading approach that can actually capitalize on the information. For that, I have found three things to work best:

1. Only trading on the strongest signals of increase, using a model whose output is a spectrum rather than a binary classification. Ironically, this usually doesn't increase accuracy much, but it does increase the "average value" of buying on the increase signals. I usually set the "threshold" through historical testing: take the historical prediction values and use a top percentage of them as the cutoff.

2. More features and feature selection tuning. Right now I'm using genetic algorithms to constantly try and test new sets of features, thresholds, "hold times" after buying, etc.

3. Work on minutes, not hours. The volatility is so high that you can actually capitalize well on the micro level in my experience.
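The threshold idea in point 1 could look roughly like the sketch below. The data is synthetic, and the function names and the 10% cutoff are assumptions made for illustration, not the actual setup described above.

```python
import random
import statistics

def top_fraction_threshold(predictions, top_fraction):
    """Return the cutoff such that roughly `top_fraction` of predictions are at or above it."""
    ranked = sorted(predictions, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[keep - 1]

def average_value(predictions, realized, threshold):
    """Mean realized return of the trades whose prediction meets the threshold."""
    taken = [r for p, r in zip(predictions, realized) if p >= threshold]
    return statistics.mean(taken)

random.seed(0)
# Synthetic history: realized return = weak predicted signal + much larger noise.
predictions = [random.gauss(0.0, 0.002) for _ in range(5000)]
realized = [p + random.gauss(0.0, 0.01) for p in predictions]

threshold = top_fraction_threshold(predictions, top_fraction=0.10)
print("cutoff:", threshold)
print("avg value, all signals:", average_value(predictions, realized, min(predictions)))
print("avg value, top 10%:    ", average_value(predictions, realized, threshold))
```

In practice the cutoff would be chosen on historical data and then applied to new predictions, so that the threshold is not tuned on the same period it is evaluated on.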

While accuracy is important, the average trade value and trades per day are far more important to returns.

Interestingly enough, the algorithm was steadily making money until April or so, when it stagnated. Mind you, it was making money from January-March due to sheer volatility even while the price was dropping most days. I've actually shut mine down for two reasons - the plateau plus the fact that the market was too thin on GDAX to quickly trade on buy signals for the amount I was running with (ending at about $3.5K). If the market thickens, I'll likely start running it again.

Takeaway: this paper's approach may seem simple, but with something so volatile it's surprisingly easy to capitalize with algorithmic trading that learns even a few small features and trades frequently.

2 comments

"More features and feature selection tuning."

This sounds an awful lot like curve fitting.

Very interesting.

Are you buying and selling, or are you shorting too?

Does this 60% accuracy remain if you change the training/prediction period? I'd guess that it would be more accurate if you train based on days versus seconds.

Just buying and selling, all as GDAX maker orders to avoid any fees, which would more or less cancel out even the best model I have. Lack of fees is another unique feature of the space.
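A rough sketch of how a per-side fee eats a small average trade. The 0.15% gross figure is the one quoted further down the thread; the 0.25% fee is a hypothetical rate chosen for illustration, not a quoted exchange fee.

```python
def net_trade_return(gross_return, fee_rate):
    """Return on one round trip (buy, then sell) after paying `fee_rate` on each side."""
    return (1.0 + gross_return) * (1.0 - fee_rate) ** 2 - 1.0

gross = 0.0015        # 0.15% average trade value, as quoted later in the thread
assumed_fee = 0.0025  # hypothetical 0.25% fee per side, for illustration only

print(f"no fee:   {net_trade_return(gross, 0.0):+.4%}")
print(f"with fee: {net_trade_return(gross, assumed_fee):+.4%}")
```

Under that assumed fee, a positive 0.15% trade turns into a net loss on every round trip.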

The training period I keep to 3 months and haven't really moved it much since initially trying things. With a month or less the model is overfit and useless; too long and it's not working with current data.

The prediction/"hold time" changes absolutely make a difference. I was running on litecoin and found about an hour to be the sweet spot.

> days versus seconds

Seconds would be useless because you can't trade that fast - minutes is what I use.

Even if accuracy increases with a hold time over days, the average trade value doesn't go up nearly enough to make up for the trade frequency of the minutes/hours level. Why make 3 trades per week with an average value of .5% when you can make 4 trades a day for an average value of .15%? The compounding of that frequency works wonders, and that 4 trades a day for .15% is what I was actually hitting for a few months.
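The arithmetic behind that comparison, as a small sketch. It assumes a 7-day trading week, that every trade earns exactly the average value, and no fees or slippage.

```python
def compound(avg_trade_return, n_trades):
    """Total return from compounding `n_trades` trades that each earn `avg_trade_return`."""
    return (1.0 + avg_trade_return) ** n_trades - 1.0

# Figures from the comment: 3 trades/week at 0.5% vs 4 trades/day at 0.15%.
weekly_slow = compound(0.005, 3)
weekly_fast = compound(0.0015, 4 * 7)  # assumes trading all 7 days of the week

print(f"3 trades/week at 0.5%:  {weekly_slow:.2%} per week")
print(f"4 trades/day at 0.15%:  {weekly_fast:.2%} per week")
```

That works out to roughly 1.5% versus 4.3% per week under these simplifying assumptions.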

For the record, I do also compare to both naive buy/hold over the training period and the average trade value for the period for all times (different than the buy/hold time because I have a profit lock-in threshold for individual trades, also tuned with genetic algorithms), and the model outperforms both still.
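A minimal sketch of those two baselines on toy data. The prices, entry points, and function names are hypothetical, and the profit lock-in threshold is left out to keep it short.

```python
from statistics import mean

def buy_and_hold_return(prices):
    """Return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0

def average_hold_return(prices, hold):
    """Mean return of buying at every bar and selling `hold` bars later."""
    return mean(prices[i + hold] / prices[i] - 1.0 for i in range(len(prices) - hold))

def average_signal_return(prices, entries, hold):
    """Mean return of buying only at the given entry indices and selling `hold` bars later."""
    return mean(prices[i + hold] / prices[i] - 1.0 for i in entries)

prices = [100.0, 101.0, 99.0, 102.0, 100.0, 103.0, 101.0]  # toy price series
entries = [2, 4]  # hypothetical bars where the model signalled a buy
hold = 1

print("buy and hold:        ", buy_and_hold_return(prices))
print("avg trade, all times:", average_hold_return(prices, hold))
print("avg trade, signals:  ", average_signal_return(prices, entries, hold))
```

The model is only adding value if the signal-driven average beats the all-times average for the same hold time, and the compounded trades beat simply holding over the period.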

The model is still predicting positively, but the average trade value is shrinking and the market is thinning, which is why it was only breaking even recently until I shut it down.

That's great. Do you use any other baselines, e.g. random daily positions? I'm guessing that your method soundly beat the 'buy and hold' baseline over the last 5 months.

Have you done any work on trying to predict breakouts or crashes? I often think how uncanny it is to compare market movements to simply the number of comments on Reddit threads, or numbers of tweets. I don't know if these are leading or lagging indicators though.

No other baselines (I'm only going for practical not theoretical or publishing so no need really).

Just market features, but I'm sure other features are out there. At the minutes level, predicting crashes doesn't do much good: crashes often have ideal 1-hour buying windows, so the macro crashes/spikes don't really matter, and if they did, they would still be captured by the model.