Ask HN: Reddit API vs. Browser Requests
3 points
by jerdthenerd
1095 days ago
I have been following the Reddit API saga quite closely, and I understand how and why Reddit as a company has an incentive to effectively take third-party apps off the market. My question is: what is stopping someone from simply writing a web scraper that acts as if it's a web browser, scrapes the actual subreddit pages (via reddit.com, not api.reddit.com), and stores them in a local cache? I'm picturing an app that runs on popular NAS software such as TrueNAS or Synology, so storage is not an issue. Is there a way for Reddit to detect that this isn't authentic traffic from an actual user? If the web scraper authenticates as a normal user and respects the request throttling, wouldn't it just fly under the radar as a particularly addicted user?
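
To make the "throttle and cache locally" part concrete, here is a rough Python sketch of what I have in mind. Everything in it is my own assumption for illustration: the `ThrottledCache` name, the 2-second default interval, and the hashed-filename cache layout are hypothetical, and nothing here reflects Reddit's actual rate limits or says anything about whether the traffic would be detectable.

```python
import hashlib
import time
import urllib.request
from pathlib import Path
from typing import Callable, Optional


def fetch_with_urllib(url: str) -> bytes:
    """Plain HTTP GET via the standard library (one possible fetch function)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


class ThrottledCache:
    """Fetch pages at most once per `min_interval` seconds, caching them on disk.

    Hypothetical helper: the name, the default interval, and the cache layout
    are illustrative assumptions, not anything taken from a real service.
    """

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        cache_dir: str,
        min_interval: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.fetch = fetch
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._last_fetch: Optional[float] = None  # time of the last real request

    def _path_for(self, url: str) -> Path:
        # Hash the URL so any URL maps to a safe, fixed-length filename.
        return self.cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url: str) -> bytes:
        path = self._path_for(url)
        if path.exists():
            # Cache hit: no network request, so no throttling needed.
            return path.read_bytes()

        if self._last_fetch is not None:
            wait = self._last_fetch + self.min_interval - self.clock()
            if wait > 0:
                self.sleep(wait)  # space out real requests

        body = self.fetch(url)
        self._last_fetch = self.clock()
        path.write_bytes(body)
        return body
```

You would construct it as something like `ThrottledCache(fetch_with_urllib, "/mnt/cache")` (the path is made up) and call `.get(url)` for each page. Repeat requests are served from disk, and only cache misses count against the throttle.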
Not to mention the sheer amount of content you'd have to scrape, which would definitely surpass "normal" user engagement.