by vivekseth 1161 days ago
https://masala-joy.vercel.app/

I scraped ~100k recipes from across the internet and built this site to focus on the South Asian ones (~2k). I will soon add features to sort and filter these recipes by diet, ingredient, and region within South Asia.

I launched just a few days ago and have no revenue yet.

1 comments

Have fun going down the rabbit hole of parsing structured data from ingredient phrases. Had a fun few weeks with that!
Yeah, it has not been trivial so far! Let me know if you have any tips you can share.
I settled on a regex-based approach with lots of data clean-up and normalisation up front. Here is an example from my site [4].
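To give a rough idea of what "clean-up first, regex second" can look like, here is a minimal Python sketch. It is not the code behind the site in [4]; the unit table, the fraction map, and the names `normalise` and `parse_ingredient` are all made up for illustration, and a real parser would need far more rules.

```python
import re
from fractions import Fraction

# Hypothetical lookup tables; real ones would be much larger.
UNITS = {
    "cup": "cup", "cups": "cup",
    "tbsp": "tbsp", "tablespoon": "tbsp", "tablespoons": "tbsp",
    "tsp": "tsp", "teaspoon": "tsp", "teaspoons": "tsp",
    "g": "g", "gram": "g", "grams": "g",
}
UNICODE_FRACTIONS = {"½": "1/2", "¼": "1/4", "¾": "3/4"}

# Leading quantity: mixed number ("1 1/2"), plain fraction ("1/2"),
# or integer/decimal ("2", "1.5"). Order matters: longest form first.
QTY_RE = re.compile(
    r"^(?P<qty>\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s*(?P<rest>.*)$"
)


def normalise(text):
    """Clean-up pass that runs before any parsing regex."""
    text = text.lower().strip()
    for glyph, ascii_form in UNICODE_FRACTIONS.items():
        # Pad with spaces so "1½" becomes "1 1/2" after collapsing.
        text = text.replace(glyph, f" {ascii_form} ")
    text = re.sub(r"\([^)]*\)", " ", text)  # drop parenthetical notes
    return re.sub(r"\s+", " ", text).strip()


def parse_ingredient(line):
    """Split an ingredient phrase into quantity, unit, name, and note."""
    text = normalise(line)
    quantity, rest = None, text
    match = QTY_RE.match(text)
    if match:
        # "1 1/2" -> Fraction(1) + Fraction(1, 2)
        quantity = sum(Fraction(part) for part in match.group("qty").split())
        rest = match.group("rest")

    words = rest.split()
    unit = None
    if words and words[0].rstrip(".") in UNITS:
        unit = UNITS[words[0].rstrip(".")]
        words = words[1:]

    # Treat anything after the first comma as a preparation note.
    name, _, note = " ".join(words).partition(",")
    return {
        "quantity": quantity,
        "unit": unit,
        "name": name.strip(),
        "note": note.strip() or None,
    }
```

With these assumptions, `parse_ingredient("1½ cups Basmati rice, rinsed")` gives a quantity of `Fraction(3, 2)`, unit `"cup"`, name `"basmati rice"`, and note `"rinsed"`, while a phrase like "Salt to taste" falls through with no quantity or unit. Doing the normalisation first keeps the quantity regex small, which is most of the appeal of this approach.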

Some other approaches I spent a lot of time on:

* Extracting Structured Data From Recipes Using Conditional Random Fields [1]

* Chef Watson [2]

* Ingredient Parser - Model Guide [3]

[1] https://archive.nytimes.com/open.blogs.nytimes.com/2015/04/0...

[2] https://blog.kitchenpc.com/2011/07/06/chef-watson/

[3] https://ingredient-parser.readthedocs.io/en/latest/guide/ind...

[4] https://pretty-recip.es/recipe?recipe-url=https%3A%2F%2Fwww....

Maybe GPT-4 or some other LLM? It might be too expensive, but I would think one could handle the technical task.