by bitL 2411 days ago
e.g. XLNet:

https://arxiv.org/abs/1906.08237

1 comment

XLNet is BERT with a bunch of additional training tricks.
BERT is a Transformer with a bunch of additional training tricks. The Transformer is self-attention with a bunch of additional training tricks...