Hacker News
by SemanticStrengh 1499 days ago
I mean that information is being lost.
https://arxiv.org/abs/1906.08237
See the XLNet paper for the argument.
https://www.microsoft.com/en-us/research/publication/mpnet-m...
Or MPNet, which attempts to combine the best of both worlds information-wise but still finds that masked modeling is much less useful than autoregressive modeling.