by polymorph1sm 2089 days ago
I recently found this paper[1] claiming near GPT-3 performance with only a fraction of the parameters. They seem to simply reformulate the input sequence, turning the classification task into a sequence generation task.
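
To make the idea concrete, here is a rough sketch of what "reformulating the input" could look like. This is my own illustration and not the paper's code: the template, the label words, and both function names are made up for the example.

```python
from typing import Optional, Tuple

# Hypothetical template and label words, chosen for illustration only.
TEMPLATE = "Review: {text} Sentiment:"
LABEL_WORDS = {0: "negative", 1: "positive"}


def to_generation_example(text: str, label: int) -> Tuple[str, str]:
    """Turn a (text, class id) pair into a (prompt, target string) pair."""
    return TEMPLATE.format(text=text.strip()), LABEL_WORDS[label]


def decode_label(generated: str) -> Optional[int]:
    """Map generated text back to a class id, or None if no label word matches."""
    words = generated.strip().lower().split()
    # Only the first generated word is inspected, with trailing punctuation removed.
    first = words[0].strip(".,!?") if words else ""
    for label, word in LABEL_WORDS.items():
        if first == word:
            return label
    return None


if __name__ == "__main__":
    prompt, target = to_generation_example("Great battery life.", 1)
    print(prompt)   # text the model would be asked to continue
    print(target)   # word the model would be trained to generate
```

Instead of predicting a class index, the model is asked to produce the label word as text, and decode_label maps that text back to a class afterwards.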

Disclaimer: I am not affiliated with any of the authors.

[1] https://arxiv.org/pdf/2009.07118.pdf