by anon373839
839 days ago
Oh, right. If the high-level task is to generate a translation or summary, I think that’s been swallowed up by the Bitter Lesson (though isn’t it an open question whether decoder-only models are the best fit? I’d like to see a T5 with the scale and pretraining that newer models have had). On the other hand, people seem to be using GPT-4 for simple text classification and entity extraction tasks that even a small BERT could do well at a fraction of the cost.