by benawad 2189 days ago
It looks like you're using the recipe-scrapers library [1] to scrape recipes, which only supports a fixed set of websites.

If you want to expand that, I recommend parsing JSON+LD and Microformats. Given your parsers folder [2], it looks like you've tried it, but only for specific websites. I would make that generic and check whether the metadata is available on any website. I wrote a blog post on this if you're interested [3].
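As a rough illustration of the generic approach described above, here is a minimal sketch, assuming Python and only the standard library, that checks any page for embedded JSON-LD ("JSON+LD" in this thread) instead of relying on a per-site parser. The names `JsonLdExtractor` and `extract_jsonld` are hypothetical and are not taken from recipe-scrapers or plainoldrecipe. Microformats are not covered here.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        # A valueless attribute is reported as None, hence the "or ''".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several pieces, so accumulate it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._chunks))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return the parsed JSON-LD payloads found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    payloads = []
    for raw in parser.blocks:
        try:
            payloads.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the whole page
    return payloads
```

If `extract_jsonld` returns an empty list for a page, a scraper could fall back to its site-specific parsers.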

source: I've built a very similar tool for my cooking app: https://www.mysaffronapp.com/

[1] https://github.com/hhursev/recipe-scrapers

[2] https://github.com/poundifdef/plainoldrecipe/blob/master/par...

[3] https://www.benawad.com/scraping-recipe-websites/

7 comments

Holy crap... I just created an account on your website and added one random recipe (cashew nut yoghurt) that did not work on the original post site, and it worked like a charm!

You've got a new paying customer :)

I'd been looking for something like your app for a long time.

Ough, your PayPal flow is not working :( fix that and you'll have a paying customer haha

I second this comment. While I am probably not going to pay for this yet (I don't have that many recipes), this site was able to scrape a recipe I cook often and put it into a format that is much better than the original blog post.

The scaling and editing recipe functionality is top-notch. I'll probably use this tool now.

I'd recommend checking out https://whisk.com too.
Oops, what part of the flow isn't working?
https://ibb.co/kQrkbJV

(will autodelete in 30 mins) there's nothing very private but...

I'm using Paddle for payments and it looks like something is wrong on their end. I'll contact them and keep you posted. Thanks for the heads up!
Thanks for sharing! I also want to make the JSON+LD stuff generic, but I have found that there are sometimes different renditions of that format. Though, now that I've looked at it, I only have 1 example of something non-standard, which doesn't include the @graph directive.

So that just requires some more research and testing. Perhaps someone enterprising will read this and make a pull request...
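For what it's worth, the variation mentioned above (a flat object versus one wrapped in @graph) can be normalized before looking for the recipe. This is only a sketch, assuming Python and already-parsed JSON-LD; `iter_nodes` and `find_recipe` are hypothetical names, and the shapes handled are the ones commonly seen in the wild, not an exhaustive list.

```python
def iter_nodes(payload):
    """Yield every JSON-LD node from a dict, a list, or an @graph wrapper."""
    if isinstance(payload, list):
        for item in payload:
            yield from iter_nodes(item)
    elif isinstance(payload, dict):
        graph = payload.get("@graph")
        if graph is not None:
            # The wrapper usually carries only @context and @graph,
            # so descend into the graph instead of yielding the wrapper.
            yield from iter_nodes(graph)
        else:
            yield payload


def find_recipe(payload):
    """Return the first node whose @type includes "Recipe", or None."""
    for node in iter_nodes(payload):
        types = node.get("@type")
        if isinstance(types, str):
            types = [types]  # @type may be a single string or a list
        if isinstance(types, list) and "Recipe" in types:
            return node
    return None
```

With this, `find_recipe({"@type": "Recipe", ...})` and `find_recipe({"@graph": [..., {"@type": "Recipe", ...}]})` go through the same code path.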

Saffron looks great, I had encountered it before building this for myself. Your blog post is quite illuminating - perhaps the first practical application of LCA that I've seen outside of an interview setting :)

> I recommend parsing JSON+LD and Microformats

It's a shame that Saffron provides neither on the published recipe pages. If I share a recipe with someone, they might want to import it into a different app.

I never considered that use case. Is that something you've run into?
Yes, and happy as I am to recommend Saffron, they may already be used to, and happy with, an alternative.
Yeah makes sense, I'll add this to my todo list
I'm a paying customer with about 240 recipes imported. Saffron works very well overall and I recommend it, though I did have to do a lot of hand editing on some of the recipes.
Your app looks awesome. I've been thinking about a way of putting all my recipes in one place for a while, and this looks like the kind of place I've been looking for.
This is impressive... Not sure why I've never played with JSON+LD before.
Years ago, I wrote something like you describe in the blog (regex to match ingredient lines, looking for imperative verbs, filling in the gaps). Recently I revisited the subject and learned that almost everybody has decent jsonld data now. Even paywalled stuff.
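For readers curious what that older heuristic style looks like, here is a minimal sketch in Python. It is not the code described above: the regex, the unit list, the verb list, and the name `classify_line` are all assumptions made for illustration, and the "filling in the gaps" step is left out.

```python
import re

# Hypothetical pattern: a leading quantity (2, 1.5, 1/2, 1 1/2),
# an optional unit from a small illustrative list, then the item.
INGREDIENT_RE = re.compile(
    r"^(?P<qty>\d+(?:\.\d+)?(?:\s+\d+/\d+|/\d+)?)\s*"
    r"(?:(?P<unit>cups?|tbsp|tsp|g|kg|ml|oz)\s+)?"
    r"(?P<item>\S.*)$",
    re.IGNORECASE,
)

# A step number such as "1. " or "2) " at the start of a line.
STEP_NUMBER_RE = re.compile(r"^\d+[.)]\s+")

# Illustrative, far from complete.
IMPERATIVE_VERBS = {
    "add", "bake", "boil", "chop", "combine", "heat",
    "mix", "pour", "preheat", "serve", "stir", "whisk",
}


def classify_line(line):
    """Label a line as 'ingredient', 'instruction', or 'other'."""
    text = line.strip()
    step = STEP_NUMBER_RE.match(text)
    if step:
        # Drop the step number so the verb check sees the first word.
        text = text[step.end():]
    elif INGREDIENT_RE.match(text):
        return "ingredient"
    words = re.findall(r"[a-z]+", text.lower())
    if words and words[0] in IMPERATIVE_VERBS:
        return "instruction"
    return "other"
```

A heuristic like this needs constant tuning per site, which is part of why structured metadata is the easier route when it is available.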

Now I've got tampermonkey watching over my shoulder and backing up everything I look at to a couchdb instance. (Still gotta write some UI and an agent to pull down images, but I've got other irons in the fire at the moment.)