by polymorph1sm
2089 days ago
I recently found this paper [1] claiming near-GPT-3 performance with only a fraction of the parameters. The authors seem to simply reformulate the input sequence so that classification becomes a sequence generation task.

Disclaimer: I am not affiliated with any of the authors.

[1] https://arxiv.org/pdf/2009.07118.pdf
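
To make the reformulation idea concrete, here is a rough sketch of how a classification input might be wrapped in a template so that the label becomes the next word to generate. This is not taken from the paper: the template text, the label words, and the names `PATTERN`, `VERBALIZER`, `to_prompt`, `classify`, and `toy_scorer` are all hypothetical, and `toy_scorer` is a keyword-counting stand-in where a real language model would go.

```python
from typing import Callable, Dict

# Hypothetical template and label words, chosen only for illustration.
PATTERN = "Review: {text}\nAll in all, it was"
VERBALIZER: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def to_prompt(text: str) -> str:
    """Wrap the raw input in a template whose natural continuation is a label word."""
    return PATTERN.format(text=text)


def classify(text: str, score_continuation: Callable[[str, str], float]) -> str:
    """Return the label whose word gets the highest score as the continuation."""
    prompt = to_prompt(text)
    return max(
        VERBALIZER,
        key=lambda label: score_continuation(prompt, VERBALIZER[label]),
    )


def toy_scorer(prompt: str, word: str) -> float:
    """Stand-in for a language model (assumption): counts cue words in the prompt."""
    cues = {
        "great": ("loved", "good", "excellent"),
        "terrible": ("hated", "bad", "boring"),
    }
    lowered = prompt.lower()
    return float(sum(lowered.count(cue) for cue in cues[word]))
```

In practice `score_continuation` would be replaced by whatever a real model reports for the candidate word given the prompt; the point is only that the classifier head disappears and the label is read off the generated text.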