Hacker News
by sdrg822 823 days ago
Dang, they use Transformer-XL from 2019, haha. Didn't realize people still used that or XLNet-like architectures.