Hacker News new | ask | show | jobs
by anon373839 840 days ago
I don’t think all of those tools have become obsolete. NER, for example, can be performed way more efficiently with spaCy than prompting a GPT-style model, and without hallucination.
1 comments

There was this assumption that for high level tasks you’ll need all of the low level preprocessing and that’s not the case.

For example, machine translation attempts were morphing the parse trees , document summarization was pruning the grammar trees etc.

I don’t know what your high level task is, but if it’s just collecting names then I can see how a specialized system works well. Although, the underlying model for this can also be a NN, having something like HMM or CRF turned out to be unnecessary.

Oh, right. If the high-level task is to generate a translation or summary, I think that’s been swallowed up by the Bitter Lesson (though isn’t it an open question if decoder-only models are the best fit? I’d like to see a T5 with the scale and pretraining that newer models have had).

On the other hand, people seem to be using GPT-4 for simple text classification and entity extraction tasks that even a small BERT could do well at a fraction of the cost.