Hacker News
by corentin88 2344 days ago
Even if it’s just HTTP(S) requests, that’s a lot of data to find and crawl. The bandwidth costs are probably insane.
2 comments

I have a background in scraping from prior projects over the last decade.

At a lot of hosting/VPS providers, bandwidth is not a concern for projects like this.

Data ingress is usually free, and scraping is almost entirely inbound traffic, which really cuts down on costs. If you can do everything in memory, it's surprisingly cheap. The important bit is being respectful of robots.txt files and not overloading small sites with too many requests.
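
For anyone wondering what "being respectful" can look like in practice, here is a rough sketch using Python's standard `urllib.robotparser`. It is not taken from any particular crawler. The `PoliteGate` class, the `demo-bot` user agent and the 1-second default delay are all made up for illustration. It only decides whether a URL may be fetched and how long to wait first, and leaves the actual HTTP request to whatever client you use.

```python
import time
from urllib import robotparser
from urllib.parse import urlsplit


class PoliteGate:
    """Hypothetical helper: decides whether and when a URL may be fetched."""

    def __init__(self, user_agent, default_delay=1.0, clock=time.monotonic):
        self.user_agent = user_agent
        self.default_delay = default_delay  # assumed fallback, in seconds
        self.clock = clock
        self.rules = {}      # host -> parsed robots.txt
        self.next_slot = {}  # host -> earliest time of the next request

    def load_rules(self, host, robots_txt):
        """Parse the text of a robots.txt file already downloaded for host."""
        parser = robotparser.RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.rules[host] = parser

    def allowed(self, url):
        parser = self.rules.get(urlsplit(url).netloc)
        # No rules loaded for this host yet: refuse rather than guess.
        return parser is not None and parser.can_fetch(self.user_agent, url)

    def wait_time(self, url):
        """Seconds to sleep before requesting url; reserves the host's slot."""
        host = urlsplit(url).netloc
        parser = self.rules.get(host)
        delay = parser.crawl_delay(self.user_agent) if parser else None
        if delay is None:
            delay = self.default_delay
        now = self.clock()
        start = max(now, self.next_slot.get(host, now))
        self.next_slot[host] = start + delay
        return start - now


if __name__ == "__main__":
    gate = PoliteGate("demo-bot")
    gate.load_rules("example.com", "User-agent: *\nCrawl-delay: 2\nDisallow: /private/\n")
    for url in ("https://example.com/a", "https://example.com/private/b"):
        if gate.allowed(url):
            time.sleep(gate.wait_time(url))
            print("ok to fetch", url)
        else:
            print("skipping", url)
```

Tracking the delay per host is the part that protects small sites: the crawler as a whole can stay busy while any single host only sees one request every few seconds.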