| HN Mirror


by adamwi 3543 days ago

Interesting. First of all, I fully agree with you regarding the business side: you always need to have the customer, their problems, and the revenue model in mind. E.g. for the health care example, I think there is a big upside on the insurer side in being able to proactively identify illnesses and treat them early (typically cheaper than emergency care, but not always), not to mention minimizing the human suffering.

But back to the actual question: rules of thumb for estimating the feasibility of a machine learning application without having access to an actual data set for the specific problem. It makes sense to break it down into different problem domains as you mention: NLP, words, image classification, etc.

The 10,000 examples for bag of words is something I will keep in mind going forward, thanks! When it comes to image classification, I guess fairly good benchmarks can be obtained by looking at available image datasets and the public models built on top of them (e.g. ImageNet and later versions), and then extrapolating from the precision they reach and the number of images needed to achieve it (assuming similar image datasets).
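To make the extrapolation idea a bit more concrete, here is a rough sketch of how I would do it. It assumes the error rate follows a power law in the number of training images (error ≈ a * n^(-b)), which is a common empirical pattern but not guaranteed for any given problem. The data points below are made up purely for illustration and are not taken from ImageNet or any real benchmark; the function names are my own.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ~ a * n**(-b) by least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(error) = intercept + slope * log(n), so a = exp(intercept), b = -slope
    return math.exp(intercept), -slope


def examples_needed(a, b, target_error):
    """Invert error = a * n**(-b) to get the n that reaches target_error."""
    return math.ceil((a / target_error) ** (1.0 / b))


# HYPOTHETICAL numbers: the error halves for every 10x more images.
sizes = [1_000, 10_000, 100_000]
errors = [0.40, 0.20, 0.10]

a, b = fit_power_law(sizes, errors)
n_needed = examples_needed(a, b, target_error=0.05)
print(f"a={a:.2f}, b={b:.3f}, images for 5% error: about {n_needed:,}")
```

With these made-up points the fit lands at roughly one million images for a 5% error rate. In practice you would plug in (dataset size, error) pairs reported for public models on a dataset similar to yours, and treat the result as an order-of-magnitude guess at best.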

Anyone aware of other relevant rules of thumb for other problem domains?