by Fordec
1811 days ago
Personally, I find there is one important factor, and one factor alone: context. Seasonality is just the context that a seasonal driver is at play.

The context of a public holiday can explain a decrease in sales on that day.
The context of a football match can explain a spike in transport demand near a stadium.
The context of a heat dome, predicted by pressure data, can explain record temperature figures in Canada.
The context of schools reopening can explain a spike in Covid cases after months of decreases.

The algorithm is, ultimately, not the deciding factor. The choice of what context you feed the algorithm as inputs is the real secret. That is why domain knowledge plus an underfit linear regression beats a fancy algorithm trained on the historical data of a single variable every time: the domain knowledge tells you what context to feed the model in the first place, which is 90% of the battle.
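
To make the point concrete, here is a minimal sketch of the holiday case. The sales figures are made up and the names (`fit_line`, `predict_with_context`, `history_only`) are my own; it is only meant to show a plain least-squares line on a holiday flag against a forecast that sees nothing but the variable's own history.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return y_bar - slope * x_bar, slope


# Made-up daily sales; a 1 in `holiday` marks a public holiday (the context).
holiday = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
sales = [100, 102, 98, 60, 101, 99, 100, 102, 58, 98]

intercept, slope = fit_line(holiday, sales)


def predict_with_context(is_holiday):
    """Forecast from the underfit linear model that knows about holidays."""
    return intercept + slope * is_holiday


# Context-free baseline: the average of the single variable's own history.
history_only = mean(sales)

# Compare both forecasts on a hypothetical upcoming holiday with sales of 59.
actual = 59
error_with_context = abs(predict_with_context(1) - actual)
error_history_only = abs(history_only - actual)
```

With a single 0/1 feature the fitted line just recovers the holiday and non-holiday averages, which is about as underfit as a model gets, and it still lands far closer on the holiday than the history-only average does.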

As an aside, the way this is handled is with a calendar like the NRF 4-5-4 calendar: https://nrf.com/resources/4-5-4-calendar
I once wrote a time library that handled calendars like this automagically, so you can roll the context in if you're clever enough.
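
For anyone who hasn't seen one, a rough sketch of the idea (not the library mentioned above): a 4-5-4 calendar splits a 52-week year into 12 periods of 4, 5 and 4 weeks per quarter, so the same period always holds the same number of weekends. The function name `period_454` and the `fiscal_year_start` parameter are hypothetical, the start date is only an illustrative Sunday, and 53-week years are deliberately not handled.

```python
from datetime import date

# Weeks per period: the 4-5-4 pattern repeated for four quarters (52 weeks).
WEEKS_PER_PERIOD = [4, 5, 4] * 4


def period_454(day, fiscal_year_start):
    """Return (period, week_in_period), both 1-based, for a 52-week year.

    `fiscal_year_start` is supplied by the caller; this sketch does not
    try to derive it, and it rejects dates outside the 52 weeks.
    """
    offset_days = (day - fiscal_year_start).days
    if not 0 <= offset_days < 52 * 7:
        raise ValueError("date is outside the 52-week fiscal year")
    week = offset_days // 7  # 0-based week index within the year
    for period, length in enumerate(WEEKS_PER_PERIOD, start=1):
        if week < length:
            return period, week + 1
        week -= length


# Illustrative start date (a Sunday); check it against the published calendar.
start = date(2021, 1, 31)
first_day = period_454(date(2021, 1, 31), start)
last_week_of_p2 = period_454(date(2021, 4, 3), start)
first_week_of_p3 = period_454(date(2021, 4, 4), start)
```

The resulting period and week numbers can then be fed to a model as features, so that comparisons line up like-for-like weeks rather than raw calendar dates.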