by radarsat1 3032 days ago
Whether you end up using a machine learning approach or hand-crafting the solution, I recommend you work in an ML-like manner: divide the data you have into training and test sets, and use cross-validation to evaluate your work.
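
To make that concrete, here is a minimal sketch of k-fold cross-validation using only the standard library. The names `k_fold_splits` and `cross_validate` are made up for illustration, and it assumes your data is a list of (text, label) pairs with at least k items. It works just as well for scoring a hand-written rule as for a trained model.

```python
import random

def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps runs repeatable
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

def cross_validate(items, predict, k=5):
    """Average accuracy of predict(text) over k held-out folds.

    Assumes items is a list of (text, label) pairs with len(items) >= k.
    """
    scores = []
    for _train, test in k_fold_splits(items, k):
        # A learned model would be fit on _train here. A hand-crafted rule
        # ignores it, but is still scored only on the held-out fold.
        correct = sum(1 for text, label in test if predict(text) == label)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)
```

If you do go the ML route, scikit-learn has ready-made versions of this, but the idea is the same: never score yourself on examples you looked at while building the thing.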

As for your actual question: yes, as others have said, it might be just an NLP/regexp problem. Otherwise, you could treat ingredient identification as a classification problem. I recommend checking out FastText and NLTK, and familiarizing yourself with the word dictionaries and pre-trained vectors that are available; these tools might help generalize your work beyond the data you have at hand.

(E.g., if it works well on your data using word vectors pre-trained on Wikipedia, chances are it might also work on examples you don't even have.)
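
For the regexp route, a rough baseline could look like the sketch below. I don't know what your data looks like, so this assumes lines shaped like "quantity, optional unit, ingredient name" (e.g. "2 cups flour"). The pattern, the unit list, and the `parse_ingredient` name are all made up for illustration, not a recommended grammar.

```python
import re

# Assumed line shape: "<quantity> [unit] <ingredient name>".
# The unit list is illustrative only and far from exhaustive.
INGREDIENT_RE = re.compile(
    r"^\s*(?P<qty>\d+(?:[./]\d+)?)\s*"
    r"(?:(?P<unit>cups?|tbsp|tsp|g|kg|ml|l|oz)\b\.?\s+)?"
    r"(?P<name>.+?)\s*$",
    re.IGNORECASE,
)

def parse_ingredient(line):
    """Return a dict with qty, unit and name, or None if the line doesn't fit."""
    m = INGREDIENT_RE.match(line)
    if m is None:
        return None
    return {
        "qty": m.group("qty"),
        "unit": m.group("unit"),  # None when no known unit is present
        "name": m.group("name"),
    }
```

Something like `parse_ingredient("2 cups flour")` gives you the three fields back, while a line such as "salt to taste" returns None. Lines that fall through like that are a good signal for where the regexp approach stops and a classifier starts to earn its keep.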