Hacker News
by
philkuz
1075 days ago
The caveat buried in the abstract is that this beats BERT and non-pretrained Transformers. GPT-style models should still do better, though they naturally come at a higher computational cost.
1 comment
jumpCastle
1075 days ago
Gzipping every query together with all of the training data can get even more expensive.
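To make the cost concrete, here is a rough sketch of a gzip-based nearest-neighbour classifier, assuming the method scores a query by normalized compression distance (NCD) against each training example. The function names (`clen`, `prepare`, `ncd`, `classify`) and the toy data are made up for illustration and are not taken from the paper's code. The point is only that each query triggers one gzip call per training example, even when the training-side lengths are cached.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def prepare(pairs):
    """Cache the compressed length of every training text once, up front."""
    return [(text, label, clen(text)) for text, label in pairs]


def ncd(x: str, y: str, cx: int, cy: int) -> float:
    """Normalized compression distance, given precomputed lengths cx and cy."""
    cxy = clen(x + " " + y)  # this concatenation cannot be cached per query
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: str, train, k: int = 3):
    """Return (predicted label, number of gzip calls made for this query)."""
    cq = clen(query)
    calls = 1
    scored = []
    for text, label, ctext in train:
        scored.append((ncd(query, text, cq, ctext), label))
        calls += 1  # one extra compression per training example
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0], calls


if __name__ == "__main__":
    # Toy data, purely illustrative.
    train = prepare([
        ("the team won the football match", "sports"),
        ("stocks fell as markets closed lower", "finance"),
        ("the striker scored a late goal", "sports"),
    ])
    label, calls = classify("the goalkeeper saved the match", train, k=1)
    print(label, calls)
```

With N training examples, `classify` makes N + 1 gzip calls per query, so inference cost grows linearly with the training set instead of staying fixed as it does for a trained model.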