| Interesting, but I'm very skeptical. Over a dozen transformer-based time series foundation models have been released in the past year, and without fail every one of them claims to be at or near SOTA. For example: - Time-LLM (https://arxiv.org/abs/2310.01728) - Lag-Llama (https://arxiv.org/abs/2310.08278) - UniTime (https://arxiv.org/abs/2310.09751) - TEMPO (https://arxiv.org/abs/2310.04948) - TimeGPT (https://arxiv.org/abs/2310.03589) - TimesFM (https://arxiv.org/html/2310.10688v2) - GPT4TS (https://arxiv.org/pdf/2308.08469.pdf) Yet not a SINGLE transformer-based model that I've managed to run successfully has beaten gradient-boosted tree models on my use case (economic forecasting). To be honest, I believe these foundation models are all vastly overfit. There are basically only two benchmark sets that ever get used in time series (the Monash set and the M-competition set), so it would be easy to over-tune a model to perform well on just these. I would love to see someone put together a broader, more varied set of benchmarks and have an independent third party run the evaluations, like with LLM leaderboards. Otherwise I assume all published benchmarks are 100% meaningless and gamed. |