by ThePhysicist 2739 days ago
I only partially agree. Building good ML models and even outperforming the ML services of the big players is absolutely feasible. Have a look, for example, at this talk from PyCon DE (in English: https://www.youtube.com/watch?v=XniwzOCWi2c), which shows how a small team built a machine vision system to read car registration numbers from official documents. The system was built and trained with an extremely small dataset (I think around 60 scanned documents with some data augmentation) and beat the Google Cloud ML algorithm by an impressive margin (Google ML had an intolerably high error rate for this seemingly simple problem).

So I'd say that if you're investigating a very specific area, you have a very good chance of beating larger players that don't specialize as much as you do. Of course, competing against Google in self-driving cars or machine translation might be a bad idea, but even in those areas there are small startups that produce impressive results (e.g. DeepL: https://www.deepl.com/en/translator). Big companies also regularly exaggerate their capabilities (sometimes more than startups do); just have a look at how IBM markets its Watson AI/ML solutions and what they deliver in reality.

So personally I'd say it has never been this easy to build relevant and interesting ML/AI-based solutions as a small team, and it is possible to beat large players if you have the right approach and the right (very narrow) problem.


DeepL is very promising. I was very sceptical about the future of automatic translation, seeing as Google Translate seems to have stagnated for the last two years or so, but I tried DeepL on a German newspaper article a couple of days ago and it did a very good job. Granted, I don’t know German (which is why I used DeepL), but the English translation it produced still seemed more polished than what Google Translate usually gives.
I've used it a fair amount, and continue to be amazed by the quality it puts out. There are still some issues with formal pronouns, subject-matter-specific contractions, etc., but otherwise it does a great job with both EN->DE and DE->EN.