Temporal Fusion Transformers look very cool: they accept time series vectors as well as categorical and numerical features, and output a distribution using quantile regression.
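For anyone unfamiliar with quantile regression, here is a rough sketch of the pinball (quantile) loss it usually minimizes. It is plain Python for illustration only, not code from the paper or from any library, and the function name `pinball_loss` is my own.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss for a single quantile level q in (0, 1).

    Illustrative helper, not taken from the TFT paper or any library.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff > 0) costs q * diff,
        # over-prediction (diff < 0) costs (1 - q) * |diff|.
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


# For q = 0.9, under-predicting by 2 is penalized far more than over-predicting by 2.
under = pinball_loss([10.0], [8.0], 0.9)
over = pinball_loss([10.0], [12.0], 0.9)
print(under, over)
```

A model trained against this loss for several values of q at once (say 0.1, 0.5 and 0.9) ends up predicting a set of quantiles, which is the sense in which it outputs a distribution rather than a point forecast.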
SoTA in 2021 is a bespoke transformer model you implement yourself around the idiosyncrasies of your data. Unless a lot of money is on the line, though, that is overkill for most situations.
https://arxiv.org/pdf/1912.09363v2.pdf
Available as a PyTorch implementation:
https://pytorch-forecasting.readthedocs.io/en/latest/index.h...