by bubblematrix 1063 days ago
Honestly, this is standard web scraping, but these projects always catch my attention.

You're at the mercy of rate-limiting firewalls (so you'll have to rotate proxies if you intend to use this heavily), on top of the standard CloudFront bot detection, reCAPTCHA, and div obfuscation (Facebook is a good example of this).

RSS-Bridge has decent caching support, customisable at the bridge level, so it comes pre-tuned and works well at low volumes for personal use.
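
To make the caching point concrete, here is a minimal sketch of a per-key time-to-live cache in Python. It is not how RSS-Bridge implements caching (RSS-Bridge is written in PHP and has its own cache layer). The names `TTLCache` and `get_or_fetch`, and the injected `fetch` and `clock` callables, are made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keep fetched feed bodies for a fixed number of seconds per key."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested without waiting
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # still fresh: no upstream request is made
        body = fetch(key)  # missing or expired: hit the upstream once
        self._entries[key] = (now, body)
        return body
```

With a TTL of, say, 15 minutes per feed, a personal reader polling every minute causes one upstream request per 15 minutes instead of 15, which is roughly why low-volume use stays under the radar.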

At large scale, like the traffic I started seeing when I ran a public RSS-Bridge Instagram/Telegram bridge, rate limits are unavoidable.

That's been my experience too. Some of the bridges take into account the rate limits imposed by the platforms, and the steps required to get content without an API key.
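
For anyone rolling their own script, a rough sketch of what "taking the rate limits into account" can look like: back off on HTTP 429 and honour a numeric `Retry-After` header when the server sends one. This is a generic pattern, not what any particular bridge does. `fetch_with_backoff`, the `Response` tuple, and the injected `fetch` and `sleep` callables are hypothetical names for illustration.

```python
import time
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def fetch_with_backoff(
    url: str,
    fetch: Callable[[str], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry on 429, waiting as long as the server asks or doubling each time."""
    for attempt in range(max_attempts):
        status, headers, body = fetch(url)
        if status != 429:
            return body  # any non-429 response is handed back as-is
        if attempt == max_attempts - 1:
            break  # out of attempts, do not sleep again
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            # Only the delay-in-seconds form is handled here, not the HTTP-date form.
            delay = float(retry_after)
        else:
            delay = base_delay * (2 ** attempt)  # exponential backoff fallback
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts: {url}")
```

Passing `fetch` in as a callable keeps the sketch independent of any HTTP library; in practice it would wrap whatever client you already use.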

So using RSS-Bridge to generate feeds from large platforms is often a lot more reliable than the typical scraping script I'd code up myself for other sites.