Hacker News
by pavel_lishin 5103 days ago
Why were you scraping? They offer a pretty good API.

User comments, but organized by user rather than by thread. That way you could build a profile of where someone posts, or turn it around and see which prolific posters are active in a given subreddit.

The thing is, it wouldn't sweep everything. A user would only get scraped if someone requested them through my app, and I had a tool that worked through the request queue (storing results in my own DB) in a metered way, so that reddit only saw a handful of requests from me per minute.
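
For anyone curious, here is a rough sketch of what a metered queue like that might look like in Python. It is not the original tool: `MeteredQueue`, `fetch`, `store`, and `max_per_minute` are hypothetical names, and the actual fetching and DB code are left as callables you would supply yourself.

```python
import time
from collections import deque


class MeteredQueue:
    """Drain queued usernames at no more than max_per_minute fetches."""

    def __init__(self, fetch, store, max_per_minute=5,
                 clock=time.monotonic, sleep=time.sleep):
        self.fetch = fetch            # hypothetical: username -> scraped data
        self.store = store            # hypothetical: (username, data) -> None
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.pending = deque()
        self.queued = set()           # avoids queueing the same user twice
        self.last_fetch = None

    def request(self, username):
        """Called when someone asks the app about a user."""
        if username not in self.queued:
            self.queued.add(username)
            self.pending.append(username)

    def drain(self):
        """Process the queue, spacing fetches at least min_interval apart."""
        while self.pending:
            if self.last_fetch is not None:
                wait = self.min_interval - (self.clock() - self.last_fetch)
                if wait > 0:
                    self.sleep(wait)
            username = self.pending.popleft()
            self.last_fetch = self.clock()
            self.store(username, self.fetch(username))
            self.queued.discard(username)
```

Nothing is fetched until `request()` is called for a user, and `drain()` sleeps between fetches, so the remote site sees at most `max_per_minute` requests from this process.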

Nonetheless, it still violates robots.txt, and although I can't dig up the link right now, the admins have said in the past that they don't want automated/batched requests hitting their site.

Were you using the API, or just scraping HTML?