Hacker News
by 0x008 1812 days ago
My experience in general is that most time series models are inadequate at predicting time series, except for very trivial cases of seasonality or simple linear/nonlinear trends.

I think that you can throw any model you like at the problem but all you will do is overfit most of the time.


Personally I find there is one important factor, and one factor alone.

Context.

Seasonality is the context that there's a seasonal driver at play.

The context of a public holiday can explain a decrease in sales on that day. The context of a football match can explain a spike in transport demand near a stadium. The context of the presence of a heat dome predicted by pressure data can explain record temperature figures in Canada. The context of reopening of schools explains a spike in Covid cases after months of decreases.

The algorithm is, ultimately, not the deciding factor. The choice of what context you feed the algorithm as inputs is the real secret. Which is why domain knowledge plus an underfit linear regression beats a fancy algorithm trained on historical data of a single variable every time: the domain knowledge tells you what context to feed the model in the first place, which is 90% of the battle.
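To make the "context as input" point concrete, here is a minimal sketch: a one-feature linear regression where the only input is a holiday flag. The sales figures are made up for illustration, and `fit_simple_ols` and `predict` are hypothetical helper names, not from any library. The fit is in-sample on toy data, so it shows the mechanics rather than proving anything.

```python
# Toy data, made up for illustration: daily sales that dip on public holidays.
sales   = [100, 102, 98, 60, 101, 99, 103, 58, 100, 97]
holiday = [0,   0,   0,  1,  0,   0,  0,   1,  0,   0]  # 1 = public holiday

def fit_simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_simple_ols(holiday, sales)

def predict(is_holiday):
    """Forecast sales for a day whose holiday status is known from the calendar."""
    return intercept + slope * is_holiday

print(f"baseline day: {predict(0):.1f}, holiday: {predict(1):.1f}")
```

The model is about as underfit as it gets, but the holiday flag is known ahead of time from the calendar, which is something a model looking only at past sales has no way to see.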

> The context of a public holiday can explain a decrease in sales on that day.

As an aside, the way this is handled is with a retail calendar like the NRF 4-5-4 calendar: https://nrf.com/resources/4-5-4-calendar

I wrote a time library once that handled calendars like this automagically, so you can roll the context in if you're clever enough.
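For anyone who hasn't seen one, a rough sketch of what a 4-5-4 lookup might involve: each quarter is 13 weeks split into "months" of 4, 5 and 4 weeks. This is not the library mentioned above. `retail_period` is a hypothetical name, the fiscal year start date has to come from the published calendar (the one in the usage line is an assumption), and folding a 53rd week into the last month is also an assumption.

```python
from datetime import date

# 4-5-4 pattern repeated over four quarters: 12 fiscal months, 52 weeks.
WEEKS_PER_MONTH = [4, 5, 4] * 4

def retail_period(day, fiscal_year_start):
    """Map a date to (fiscal_month, week_in_month), both 1-based."""
    week_index = (day - fiscal_year_start).days // 7
    if not 0 <= week_index <= 52:
        raise ValueError("day is outside the fiscal year starting at fiscal_year_start")
    for month, weeks in enumerate(WEEKS_PER_MONTH, start=1):
        if week_index < weeks:
            return month, week_index + 1
        week_index -= weeks
    # Assumption: in a 53-week year the extra week is folded into month 12.
    return 12, WEEKS_PER_MONTH[-1] + 1

# Assumed start date, for illustration only; look up the real one.
start = date(2021, 1, 31)
print(retail_period(date(2021, 4, 3), start))  # (fiscal month, week in month)
```

The resulting (month, week) pair can then be fed to a model as context, so the same fiscal week is compared from one year to the next instead of the same calendar date.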

Is there also "one factor alone" in predicting stock market performance? Foreign-exchange rates? Lottery numbers?
The stock market is anti-inductive: future performance takes into account your attempts at predicting it. Any regularity in the stock market disappears the moment someone spots it and starts trading on it.
Yeah, models assume future data can be predicted from past data, which of course is not always true in real life due to fundamental limitations of statistics, such as the very fat-tailed distributions in stock markets.

Competent data scientists must be able to spot those cases quickly.

Predicting stock market performance, i.e. performance of individual equities?

The TL;DR answer is "no, but...".

No, because by definition there is a high probability for individual equities to display idiosyncratic behaviour. Why? Because we are, after all, talking about individual companies. So their stock market performance is inherently tied to their corporate financial performance, their corporate prospects and how investors feel about all that jazz.

The "but" comes because there are, as always, exceptions to the rule.

You can, for example, engage in momentum trading. That should be (reasonably!) simple to model with a few inputs.

Otherwise, at the other end of the complexity spectrum, you can build a model to identify stocks that are in a macro regime. When stocks are in a macro regime, it means that they are behaving as a proxy for macroeconomics instead of following the usual company-specific measures. This means you can build your model based on real quantitative measures (i.e. suitable macro factors) instead of trying to second-guess idiosyncratic stock behaviour. The only real downside is that you will need access to quality macro data feeds, so if you are thinking of doing this as a retail investor (i.e. private individual) you might find yourself falling at the first hurdle.

This is to be expected, because pure time-series models (Holt-Winters, ARIMA, etc.) only capture behavior of historical data (autoregressive, i.e. yₖ = f(yₖ₋₁, yₖ₋₂, ...)). If the patterns of interest aren't primarily time-based patterns, then time series models wouldn't be predictive.

In my experience, the time-series models that are reliably predictive typically aren't purely autoregressive but contain exogenous variables as well (i.e. yₖ = f(yₖ₋₁, yₖ₋₂, ..., xₖ, xₖ₋₁, xₖ₋₂, ...), like ARX models). These models capture relationships not only to historical patterns but also to other driving/causal variables.
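A minimal sketch of the simplest such model, assuming one lag and one exogenous input: yₖ = a·yₖ₋₁ + b·xₖ, fitted by least squares. `fit_arx1` is a hypothetical name, the data is synthetic and noise-free (generated with a = 0.5, b = 2.0 so the fit can be checked), and a real model would need more lags, an intercept and out-of-sample validation.

```python
def fit_arx1(y, x):
    """Least-squares fit of y[k] = a * y[k-1] + b * x[k]; returns (a, b)."""
    y_prev, x_now, target = y[:-1], x[1:], y[1:]
    # Sums that make up the 2x2 normal equations for the unknowns a and b.
    s_pp = sum(p * p for p in y_prev)
    s_px = sum(p * u for p, u in zip(y_prev, x_now))
    s_xx = sum(u * u for u in x_now)
    s_pt = sum(p * t for p, t in zip(y_prev, target))
    s_xt = sum(u * t for u, t in zip(x_now, target))
    det = s_pp * s_xx - s_px * s_px  # zero if the two regressors are collinear
    a = (s_pt * s_xx - s_xt * s_px) / det
    b = (s_xt * s_pp - s_pt * s_px) / det
    return a, b

# Synthetic example: x is the exogenous driver, y follows the ARX recursion.
x = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0, 1.0, 2.0]
y = [0.0]
for k in range(1, len(x)):
    y.append(0.5 * y[k - 1] + 2.0 * x[k])

a, b = fit_arx1(y, x)
x_next = 1.5  # assumed known or separately forecast, e.g. a weather forecast
y_next = a * y[-1] + b * x_next
print(f"a={a:.3f}, b={b:.3f}, one-step forecast={y_next:.3f}")
```

Note that forecasting with an exogenous term requires the future xₖ, which either has to be known in advance (calendar effects) or forecast itself (weather).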

Price forecasts are often modeled as time-series models, but this assumes that prices only have time-based patterns, which is often not true. In my domains of interest for instance, time has a tangible yet limited effect on prices -- prices are driven more by variables like weather and certain types of market activity.

Totally agree. There is more than enough published research confirming that simply predicting the last observed value (the naive forecast) is, on average, only marginally worse than most time series models.
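A quick way to check this on your own data is to score any model relative to the last-value forecast. The sketch below is one simple way to do that, using mean absolute error; `relative_mae` is a hypothetical helper and the series is made up for illustration.

```python
def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def naive_forecast(series):
    """Forecast each point with the previous observation (no forecast for the first)."""
    return series[:-1]

def relative_mae(series, model_pred):
    """MAE of the model divided by MAE of the naive forecast.

    model_pred[i] is the model's forecast for series[i + 1].
    Below 1.0 the model beats the naive forecast; near 1.0 it adds little.
    """
    actual = series[1:]
    return mean_abs_error(actual, model_pred) / mean_abs_error(actual, naive_forecast(series))

series = [10, 12, 11, 13, 12, 14]      # made-up observations
model_pred = [11, 12, 12, 13, 13]      # made-up one-step-ahead model forecasts
print(f"relative MAE: {relative_mae(series, model_pred):.2f}")
```

If the ratio hovers around 1.0 on held-out data, the model is not earning its complexity.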