Hacker News
by Vetch 1657 days ago
While you make sensible points, in the case of GPT-3, not everyone will be willing to route their data through OpenAI's servers.

> Just use DistilBERT uncased/cased (which is fast enough to run on consumer CPUs)

This can still be impractical, at least in my case of regularly needing to process hundreds of pages of text. Simpler systems can be much faster for an acceptable loss in accuracy, and you can get more robustness by working with the full label distribution instead of just picking the argmax.
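To make the label distribution point concrete, here is a minimal sketch. The per-sentence probabilities are made-up numbers, and `vote_by_argmax` / `vote_by_mean` are names I picked for illustration, not anything from a library. It compares a majority vote over hard labels with averaging the distributions before deciding.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def vote_by_argmax(distributions):
    """Majority vote over hard labels; confidence is thrown away."""
    counts = [0] * len(distributions[0])
    for dist in distributions:
        counts[argmax(dist)] += 1
    return argmax(counts)


def vote_by_mean(distributions):
    """Average the distributions first, then pick the label."""
    n = len(distributions)
    mean = [sum(column) / n for column in zip(*distributions)]
    return argmax(mean), mean


# Hypothetical per-sentence distributions over [negative, positive].
sentence_dists = [
    [0.55, 0.45],  # barely negative
    [0.52, 0.48],  # barely negative
    [0.05, 0.95],  # confidently positive
]

hard_label = vote_by_argmax(sentence_dists)
soft_label, mean_dist = vote_by_mean(sentence_dists)
print(hard_label, soft_label, mean_dist)
```

Two near coin-flips outvote one confident prediction under the hard vote, while the averaged distribution goes the other way. Which behaviour you want depends on the task, but keeping the distribution at least leaves the choice open.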

Fast, simpler classifiers can also help decide where the more resource-intensive models should focus their attention.
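One way to read that is as a cascade: run the cheap model on everything and only escalate the inputs it is unsure about. A rough sketch under my own assumptions follows. The keyword scorer, the word lists, the 0.7 threshold, and the `expensive_stub` placeholder are all hypothetical stand-ins, not a real model or API.

```python
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"awful", "broken", "terrible"}


def cheap_classifier(text):
    """Toy keyword scorer returning [p_negative, p_positive] with add-one smoothing."""
    words = text.lower().split()
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    total = pos + neg + 2
    return [(neg + 1) / total, (pos + 1) / total]


def cascade(texts, cheap, expensive, threshold=0.7):
    """Use the cheap model first; escalate only when its top probability is below threshold."""
    results, escalated = [], []
    for text in texts:
        dist = cheap(text)
        if max(dist) < threshold:
            dist = expensive(text)
            escalated.append(text)
        results.append(dist)
    return results, escalated


def expensive_stub(text):
    """Placeholder for a slow model such as a transformer; returns a fixed distribution."""
    return [0.5, 0.5]


texts = [
    "great product love it",
    "it arrived on tuesday",
    "awful and broken",
]
results, escalated = cascade(texts, cheap_classifier, expensive_stub)
print(escalated)
```

Here only the sentence with no sentiment keywords is sent to the expensive model. The threshold trades compute for accuracy and would need tuning on real data.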

Another reason for preprocessing is rule systems. Even if they are not glamorous to talk about, they still see heavy use in practical settings. While dependency parses are hard to make use of, shallow parses (chunking) and part-of-speech tags can be usefully fed into rule systems.
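As an illustration of feeding part-of-speech tags into a rule, here is a small sketch of a rule-based noun phrase chunker. It assumes Penn Treebank style tags already produced by some upstream tagger (the tagged sentence below is hand-written), and `chunk_noun_phrases` is a name I made up. The rule is deliberately loose: any run of determiners and adjectives followed by one or more nouns.

```python
def chunk_noun_phrases(tagged):
    """Group (token, tag) pairs into noun phrases: DT/JJ modifiers, then NN* nouns."""
    chunks = []
    current = []
    has_noun = False
    for token, tag in tagged:
        is_noun = tag.startswith("NN")
        is_modifier = tag in ("DT", "JJ")
        if is_noun:
            current.append(token)
            has_noun = True
        elif is_modifier and not has_noun:
            current.append(token)
        else:
            # Anything else ends the current candidate; keep it only if it had a noun.
            if has_noun:
                chunks.append(" ".join(current))
            # A modifier right after a noun starts a new candidate phrase.
            current = [token] if is_modifier else []
            has_noun = False
    if has_noun:
        chunks.append(" ".join(current))
    return chunks


# Hand-tagged example; in practice the tags would come from a POS tagger.
tagged_sentence = [
    ("The", "DT"), ("quick", "JJ"), ("fox", "NN"),
    ("jumped", "VBD"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
print(chunk_noun_phrases(tagged_sentence))
```

Downstream rules can then match on these chunks (for example, look for product names inside noun phrases) rather than on raw tokens.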