by abhgh 2490 days ago
I was going to mention XLNet before I saw your comment.

Also, an interesting recent piece of work [1] shows that, with the right control parameters, gated RNNs such as LSTMs can still do pretty good language modeling.

[1] http://www.abigailsee.com/2019/08/13/what-makes-a-good-conve...