by whimsicalism
2081 days ago
I was sloppy in my skimming of the paper. On a closer read, it does actually seem quite different from the literature I mentioned (examples: RoBERTa, XLNet). I'll be reading it more carefully, but I can now better understand the comparison to GPT-3.