by PaulHoule 3502 days ago
If you're interested in commercialization, you should start from day one with some estimate of the value the application creates. That is, "saves $X" or "creates $X in revenue".

I do work in the natural language and item matching areas and in those cases I do what I call "preliminary evaluation" by working a small number of cases (say 10-20) in depth and putting together some story about what kind of outputs would be expected, what the actual requirements are, and what a decision process is going to have to take into account. You've got to put together a plausible story that the decision process exists.

For your case I would say the dog example is more feasible than the health care one. The caveat is what the negatives are like for the dog: are we looking at photos that have a lot of yellow and red? Are we looking at photos of dogs, etc? As for health care, prediction just adds to the health care boondoggle unless you can make the case of making a difference in outcomes and cost as opposed to just getting a better score at Kaggle.

For text, if the problem is one that bag-of-words can handle, I'd say you want 10,000 examples of items in the class and at least that many outside it to get results that you'd really be proud of. You might get that down to as little as 1,000 if some dimensionality reduction is in use.
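
A minimal sketch of what that rule of thumb amounts to, in plain Python. The names `bag_of_words`, `enough_examples`, and the two constants are hypothetical and only for illustration; the thresholds are the figures quoted above, which are rough guidance rather than guarantees.

```python
from collections import Counter
import re

# Thresholds from the comment above: rules of thumb, not guarantees.
MIN_PER_CLASS = 10_000
MIN_PER_CLASS_REDUCED = 1_000  # when some dimensionality reduction is in use


def bag_of_words(text):
    """Map a text to word counts, discarding word order."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def enough_examples(n_in_class, n_out_of_class, reduced=False):
    """Check label counts against the rule of thumb for both sides of the class."""
    needed = MIN_PER_CLASS_REDUCED if reduced else MIN_PER_CLASS
    return n_in_class >= needed and n_out_of_class >= needed


print(bag_of_words("The dog saw the dog"))
print(enough_examples(2_000, 15_000))                # too few in-class examples
print(enough_examples(2_000, 15_000, reduced=True))  # passes the relaxed threshold
```

The check is deliberately crude: it says nothing about label quality or how hard the negatives are, which the comment above treats as a separate question.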

The center of my approach, when precision matters, is case-based reasoning, where you typically find that there is one simple strategy that works, say, 70% of the time, then a patch that gets you to 80%, and then you keep adding exceptional cases to work up toward the asymptote. In a lot of cases like that you can establish a proof of a lower bound on how accurate the results are and work up to handling more and more cases.
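
One way to picture the "simple strategy, then patches, then exceptions" pattern is an ordered list of rules where the most specific rule that applies wins. The sketch below is a contrived toy (English plurals), not the commenter's actual system; `classify`, `accuracy`, `CASES`, and the rules are all hypothetical, and the ten cases were picked so the stages land on round numbers.

```python
def classify(item, rules, default=None):
    """Return the answer of the first rule whose test accepts the item."""
    for applies, answer in rules:
        if applies(item):
            return answer(item)
    return default


def accuracy(cases, rules):
    """Fraction of worked (item, expected) cases the rule list gets right."""
    hits = sum(1 for item, expected in cases if classify(item, rules) == expected)
    return hits / len(cases)


# Hand-worked cases: the kind of small set you would study in depth first.
CASES = [("cat", "cats"), ("dog", "dogs"), ("book", "books"), ("car", "cars"),
         ("tree", "trees"), ("house", "houses"), ("pen", "pens"),
         ("box", "boxes"), ("child", "children"), ("mouse", "mice")]

EXCEPTIONS = {"child": "children", "mouse": "mice"}

base = (lambda w: True, lambda w: w + "s")                      # simple strategy
suffix_patch = (lambda w: w.endswith(("s", "x", "ch", "sh")),   # first patch
                lambda w: w + "es")
exception_patch = (lambda w: w in EXCEPTIONS,                   # exceptional cases
                   lambda w: EXCEPTIONS[w])

# Each stage puts the more specific rules ahead of the general one.
stages = [[base], [suffix_patch, base], [exception_patch, suffix_patch, base]]
scores = [accuracy(CASES, rules) for rules in stages]
print(scores)
```

Scoring every stage against the same worked cases shows what each patch buys. Note that this is a measured figure on those cases only; a provable lower bound, as mentioned above, needs an argument about which inputs each rule is guaranteed to handle.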

A core issue though is evaluating what matters, which is why I say follow the money. There is no better way to destroy evaluators than making them split hairs that don't matter.
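
As a rough illustration of "follow the money", an evaluation can be weighted by what each kind of outcome is worth instead of counting errors equally. Everything in this sketch is hypothetical: the function `net_value`, the counts, and the dollar figures are invented purely to show the arithmetic.

```python
def net_value(true_pos, false_pos, false_neg,
              value_per_hit, cost_per_false_alarm, cost_per_miss):
    """Dollar value of a batch of decisions: gains from hits minus error costs."""
    return (true_pos * value_per_hit
            - false_pos * cost_per_false_alarm
            - false_neg * cost_per_miss)


# Hypothetical counts on 1,000 cases, 100 of them positive.
# Model A makes fewer errors overall (45 vs. 70); model B misses fewer positives.
model_a = net_value(60, 5, 40, value_per_hit=200,
                    cost_per_false_alarm=20, cost_per_miss=500)
model_b = net_value(90, 60, 10, value_per_hit=200,
                    cost_per_false_alarm=20, cost_per_miss=500)
print(model_a, model_b)
```

With these made-up costs the model with more raw errors comes out ahead in dollars, which is the kind of distinction evaluators should be spending their time on.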

1 comment

Interesting. First of all, I fully agree with you regarding the business side. You always need to have the customer, their problems, and the revenue model in mind. E.g. for the health care example, I think there is a big upside on the insurer side in being able to proactively identify illnesses and treat them early (typically cheaper than emergency care, but not always), not to mention minimizing human suffering.

But back to the actual question: rules of thumb for estimating the feasibility of a machine learning application without having access to an actual data set for the specific problem. It makes sense to break it down by problem domain as you mention: NLP, words, image classification, etc.

The 10,000 examples for bag of words is something I will keep in mind going forward, thanks! When it comes to image classification, I guess a fairly good benchmark can be obtained by looking at available image datasets and the public models built on top of them (e.g. ImageNet and later versions) and then extrapolating the precision and the number of images needed to achieve it (assuming similar image datasets).

Anyone aware of other relevant rules of thumb for other problem domains?