by vivekseth 1156 days ago
Yeah, it has not been trivial so far! Let me know if you have any tips you can share.
2 comments

I settled on a regex-based approach with lots of data clean-up and normalisation up front. Here is an example from my site [4].
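
For illustration, a regex-based pass with up-front normalisation might look roughly like the sketch below. This is not the code behind the site: the unit table, the fraction map, the pattern, and the function names (normalise, parse_quantity, parse_ingredient) are all assumptions made up for the example, and a real parser would need far more clean-up rules.

```python
import re
from fractions import Fraction

# Hypothetical unit aliases; a real table would be much longer.
UNITS = {
    "cup": "cup", "cups": "cup",
    "tbsp": "tbsp", "tablespoon": "tbsp", "tablespoons": "tbsp",
    "tsp": "tsp", "teaspoon": "tsp", "teaspoons": "tsp",
    "g": "g", "gram": "g", "grams": "g",
    "ml": "ml",
}

# A few vulgar-fraction glyphs, rewritten as " n/d" so "1½" becomes "1 1/2".
VULGAR = {"½": " 1/2", "¼": " 1/4", "¾": " 3/4", "⅓": " 1/3", "⅔": " 2/3"}

# Longest alias first so "tablespoons" is tried before "tablespoon".
_unit_alt = "|".join(sorted(map(re.escape, UNITS), key=len, reverse=True))

# Optional quantity ("2", "1.5", "1/2", "1 1/2"), optional unit, then the name.
LINE_RE = re.compile(
    rf"^(?P<qty>\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)?\s*"
    rf"(?:(?P<unit>{_unit_alt})\b\.?\s+)?"
    rf"(?:of\s+)?(?P<name>.+)$"
)


def normalise(line: str) -> str:
    """Clean-up step: ASCII fractions, no parentheticals, single spaces, lowercase."""
    for glyph, ascii_form in VULGAR.items():
        line = line.replace(glyph, ascii_form)
    line = re.sub(r"\([^)]*\)", " ", line)  # drop notes such as "(sifted)"
    return re.sub(r"\s+", " ", line).strip().lower()


def parse_quantity(text):
    """Turn "1 1/2" or "1.5" into a Fraction; None if no quantity was found."""
    if text is None:
        return None
    return sum(Fraction(part) for part in text.split())


def parse_ingredient(line: str):
    """Return a dict with quantity, canonical unit, and name, or None if empty."""
    match = LINE_RE.match(normalise(line))
    if match is None:
        return None
    return {
        "quantity": parse_quantity(match.group("qty")),
        "unit": UNITS.get(match.group("unit")),
        "name": match.group("name"),
    }
```

Calling parse_ingredient("1½ cups Flour (sifted)") gives a quantity of Fraction(3, 2), the unit "cup", and the name "flour". Lines without a quantity or a known unit, such as "salt to taste", fall through with both fields set to None, which is where most of the extra clean-up work tends to go.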

Some other approaches I spent a lot of time on:

* Extracting Structured Data From Recipes Using Conditional Random Fields [1]

* Chef Watson [2]

* Ingredient Parser - Model Guide [3]

[1] https://archive.nytimes.com/open.blogs.nytimes.com/2015/04/0...

[2] https://blog.kitchenpc.com/2011/07/06/chef-watson/

[3] https://ingredient-parser.readthedocs.io/en/latest/guide/ind...

[4] https://pretty-recip.es/recipe?recipe-url=https%3A%2F%2Fwww....

Maybe GPT-4 or some other LLM? It may be too expensive, but I would think they'd be able to handle the technical task.