Hacker News
by
navjordj
1160 days ago
The original T5 models were trained with a masked language modelling objective similar to BERT's, and were afterwards also trained on an autoregressive task similar to GPT's.
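
To make the two objectives concrete, here is a rough sketch in plain Python of how training pairs for each could be built. It is illustrative only, not the actual T5 preprocessing: the function names `span_corrupt` and `prefix_lm_split` are made up for this example, the masked spans are chosen by hand instead of sampled, and the prefix split is just one assumed way to pose an autoregressive task for an encoder-decoder model. The `<extra_id_N>` sentinel naming follows the convention used by the T5 vocabulary.

```python
def span_corrupt(tokens, spans):
    """Build a (input, target) pair for a T5-style masked objective.

    `spans` is a sorted list of non-overlapping (start, end) index pairs,
    end exclusive. Each span is replaced by one sentinel in the input and
    spelled out after the same sentinel in the target.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # keep the unmasked text
        inputs.append(sentinel)              # one sentinel per masked span
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # the model must predict this
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


def prefix_lm_split(tokens, split):
    """Assumed autoregressive setup: encode a prefix, decode the rest."""
    return tokens[:split], tokens[split:]


tokens = "Thank you for inviting me to your party last week .".split()

masked_in, masked_out = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(masked_in))
print(" ".join(masked_out))

prefix, continuation = prefix_lm_split(tokens, 4)
print(" ".join(prefix), "->", " ".join(continuation))
```

In the masked case the decoder only has to fill in the gaps, while in the prefix case it has to generate the whole continuation left to right, which is the GPT-like part.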