by lukevp 2171 days ago
What are some domains in which a solo developer could build something commercially compelling to capture some of this $37 trillion? Are there any workflows, tools, or efficiencies that could easily be turned into a commercial offering without requiring massive man-hours to implement?
6 comments

Take any domain that requires classification work and has not yet been targeted, and make a run for it. You will likely be able to adapt one of the existing nets, or even use transfer learning, to outperform a human. That's the low-hanging fruit.

For instance: quality control: abnormality detection (for instance: in medicine), agriculture (lots of movement there right now), parts inspection, assembly inspection, sorting and so on. There are more applications for this stuff than you might think at first glance. Essentially, if a toddler can do it and it is a job right now, that's a good target.

> abnormality detection (for instance: in medicine), agriculture (lots of movement there right now), parts inspection, assembly inspection, sorting and so on

None of these is something someone can run from their bedroom: they have very high quality and regulatory requirements, and they require constant work outside of the actual AI training.

This is actually reflected in the margins of "AI" companies, which are significantly lower than those of traditional SaaS businesses. These companies need significantly more manpower to deal with long-tail problems, which is where the AI fails and also what actually matters.

Well, depending on the size of your bedroom ;) I've seen teams of two people running fairly impressive ML-based stuff. They were good enough at it that they didn't remain at two people for very long, but that was more than enough to be useful to others. One interesting company - that I'm free to talk about - did a nice one on e-commerce sites to help with risk management: spot fraudulent orders before they ship.

In the long term, and to stay competitive, you will always have to get out of bed and go to work. But the initial push can easily come from just a very small number of people engaging an otherwise dormant niche.

Yes, medicine has regulatory requirements. But as long as you advise rather than diagnose, the regulatory requirements drop to almost nil.

> anything that's even remotely profitable is already taken

This simply isn't true. Every year since the present-day ML wave started has seen more and more domains tackled. Even something like that silly Lego sorting machine I built could be the basis of a whole company pursuing sorting technology if you set your mind to it. And that's just ResNet50 in disguise; you could likely do better today with little effort.

Your statement reminds me of 'all the good domains are taken', which I've been hearing since 1996 or so. Of course you'll need to do some work to identify a niche that doesn't have a major player in it yet. But the 'boring' niches are where a lot of money is to be made; the sexy stuff (cancer, fruit sorting) is well covered. More obscure things are still wide open: I get decks with some regularity about new players in very interesting spaces using thinly wrapped ML to do very profitable things.

Ah yes, of course. There will never be a new profitable ML startup until the end of time. Makes perfect sense.
People said the same thing about SaaS 5 years ago
You can give this article by Chip Huyen a read. Mayhaps you will find a niche for a solo or small dev team. It is focused on MLOps, though, if that makes a difference for the type of niche you're looking for.

https://huyenchip.com/2020/06/22/mlops.html

Extracting and selling data stuck in the mountain ranges of PDFs and other useless formats in every large corp, org, govt dept on the planet.

Do it for a couple of publicly available docs and then contact the org saying you offer 'archive digitization' so their data ppl can mine for intelligence.

Most of the time and resources of 'Digital Transformation'/Data Science Depts go to just manually extracting info from all kinds of old docs, PDFs, and spreadsheets containing institutional knowledge.
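
As a rough illustration of the extraction step described above, here is a minimal Python sketch that turns semi-structured text into rows a data team could load. It assumes the text has already been pulled out of the PDF by some other tool; the sample text, the field names (`invoice`, `date`, `total`), and the helper names `extract_records` and `to_csv` are all hypothetical, and real archives would need far messier patterns than this.

```python
import csv
import io
import re

# Hypothetical sample: text as it might look after a PDF-to-text step.
# The layout and field names are assumptions made for illustration only.
SAMPLE_TEXT = """\
Invoice No: INV-1042
Date: 2019-03-14
Total Due: $1,250.00

Invoice No: INV-1043
Date: 2019-03-15
Total Due: $310.50
"""

# One pattern per record; named groups become column names.
RECORD_PATTERN = re.compile(
    r"Invoice No:\s*(?P<invoice>\S+)\s+"
    r"Date:\s*(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"Total Due:\s*\$(?P<total>[\d,]+\.\d{2})"
)


def extract_records(text):
    """Return one dict per record found in the text."""
    records = []
    for match in RECORD_PATTERN.finditer(text):
        records.append({
            "invoice": match.group("invoice"),
            "date": match.group("date"),
            # Drop thousands separators before converting to a number.
            "total": float(match.group("total").replace(",", "")),
        })
    return records


def to_csv(records):
    """Serialize the extracted records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["invoice", "date", "total"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(extract_records(SAMPLE_TEXT)))
```

The regex approach only works when documents share a stable layout; for scanned or irregular documents you would likely need OCR or a learned extractor in front of it.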

You need to be creative. But one example - colorizing old photos: https://twitter.com/citnaj
The cost of training is decreasing, but the meaningfully large and non-trivial training sets are almost exclusively in the domain of large companies, economically inaccessible to the individual developers/startups.
This is a space I worked on during the crypto boom of 2017/2018.

There is an opportunity for a decentralized network that allows models to be trained on the training sets held at facilities.

Think of all the data sitting in silos from clinical trials. There is of course the painful process of authenticating researchers for access to data like that but it can be done. There just needs to be an economic reason to make that kind of effort.
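
One possible reading of training models on data that stays at each facility is federated averaging: every site trains locally and only model weights are shared. The sketch below is an assumption about how such a scheme could look, not the design described in this comment. It uses a toy one-variable linear model, and the facility data, learning rate, and function names are hypothetical.

```python
def local_update(weights, data, lr=0.05, steps=20):
    """Run a few gradient-descent steps on one facility's private data.

    Only the updated weights and the sample count leave the facility.
    """
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of mean squared error for the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n


def federated_average(updates):
    """Combine local weights, weighting each facility by its sample count."""
    total = sum(n for _, n in updates)
    w = sum(local[0] * n for local, n in updates) / total
    b = sum(local[1] * n for local, n in updates) / total
    return (w, b)


def train(facilities, rounds=50):
    """Coordinator loop: broadcast weights, collect updates, average."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in facilities]
        weights = federated_average(updates)
    return weights


if __name__ == "__main__":
    # Hypothetical silos: each holds points drawn from y = 2x + 1.
    facilities = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    ]
    w, b = train(facilities)
    print(round(w, 2), round(b, 2))
```

This leaves out everything that makes the real problem hard, such as authenticating participants, paying data holders, and protecting against weights leaking information about the underlying records.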

I got pulled in the direction of using ML to predict costs of care in insurance, so I didn’t go further down the rabbit hole, but I did author a patent for a novel approach to having a decentralized identity exchange data.

If any of this sounds exciting to you feel free to email me. hn (at) strapr (dot) com

krisp.ai, but using the GPU (also on Mac) and with a desktop version for Ubuntu Linux.