


	by anon373839 839 days ago
	Oh, right. If the high-level task is to generate a translation or summary, I think that’s been swallowed up by the Bitter Lesson (though isn’t it an open question whether decoder-only models are the best fit? I’d like to see a T5 with the scale and pretraining that newer models have had). On the other hand, people seem to be using GPT-4 for simple text classification and entity extraction tasks that even a small BERT could handle well at a fraction of the cost.