by 995533
2727 days ago
> Nobody cares how long it takes to train a model. What matters is prediction speeds, which are comparable (and NLP less likely to require high frequency, where a few more milliseconds matters). Besides that, the accuracy gains are not marginal anymore (BoW can't compete like it used to, especially with pre-trained models).
This isn't true. It depends on your priorities and goals. Machine learning that spends most of its time unable to learn is not real AI. Some of us are interested in sample- and energy-efficient learning that is capable of online incremental updates and immune to catastrophic forgetting. That matters not just because it is truer to actual learning, but because it moves away from depending on a handful of companies to do the actual training.
Anticipating some replies: no, transfer learning or meta-learning methods don't really avoid this. In the case of transfer learning, you still have that high coupling to a handful of sources. The downsides of that are their own discussion. In addition, there are times when the ability to extract local relations can be dulled by the dominant Wikipedia and Common Crawl representations. Meta-learning gets you fast updates, but you still cannot stray too far from the domains that were met at training time.
> What matters is prediction speeds
I'm not a fan of bag-of-words models either, but a simple dot product is always going to be faster than many matrix multiplies and/or convolutions. The implementor should always try these as a baseline and decide whether the speed/accuracy trade-off is worth it for them.
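
To make the baseline concrete, here is a rough sketch of what I mean, not anyone's production code: a bag-of-words linear classifier where prediction is one sparse dot product and learning is a perceptron-style online update. The function names (`featurize`, `score`, `predict`, `update`), the whitespace tokenizer, and the toy sentiment stream are all made up for illustration.

```python
from collections import Counter

def featurize(text):
    """Bag of words: lowercase whitespace tokens mapped to counts."""
    return Counter(text.lower().split())

def score(weights, features):
    """Sparse dot product between the weights and the word counts."""
    return sum(weights.get(tok, 0.0) * cnt for tok, cnt in features.items())

def predict(weights, text):
    """Return +1 or -1 from the sign of a single dot product."""
    return 1 if score(weights, featurize(text)) > 0 else -1

def update(weights, text, label, lr=1.0):
    """Perceptron-style online update: change weights only on a mistake."""
    feats = featurize(text)
    if label * score(weights, feats) <= 0:
        for tok, cnt in feats.items():
            weights[tok] = weights.get(tok, 0.0) + lr * label * cnt
    return weights

# Hypothetical toy stream; each example is seen once, in order.
weights = {}
stream = [
    ("great fun movie", 1),
    ("dull boring movie", -1),
    ("great acting", 1),
    ("boring plot", -1),
]
for text, label in stream:
    update(weights, text, label)
```

Each incoming example costs one sparse dot product plus, at most, one pass over its own tokens, so the model can keep learning after deployment without a separate training run. Whether the accuracy is good enough is exactly the trade-off to measure before reaching for something heavier.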