Hacker News
by mtokunaga 3233 days ago
This type of decision might also impact Yelp and others in similar businesses. Currently their API returns only a few top reviews per business, and they also prohibit "scraping" the data by other means.

I was going to do some experiments with larger datasets from businesses in a region, but quickly found that's not possible.

3 comments

And Google search results as well. Search results are publicly accessible, but if you try to crawl them, Google will block you.

If it becomes illegal to block crawlers, then Google is gonna get hammered with bot traffic.

It would also mean that Google wouldn't have to be the front end to its search results: anyone could build on top of them, which could kill Google's ad revenue, because you could then offer anonymous Google searches.

Google already isn't the only frontend to their search. Look at Startpage, for example.

You might try Apifier for that; we've recently scraped more than 150k reviews for 27k restaurants in London.

Here's a community crawler you can use: https://www.apifier.com/community/crawlers/Yonny/bcYqH-api-u...

Yelp has several public data sets available for research purposes. Might not be the region you were looking at, but for academic purposes, might be useful.