by lucidrains 2402 days ago
Another work in the opposite direction, introducing gating in Transformer-XL: https://arxiv.org/abs/1910.06764