Hacker News
by simonw 1041 days ago
Yeah, that's a good idea - I need to add that to my suggestions for how to implement this.

If you're scraping any significant amount of data (>500K), then depending on the frequency you might also want to add ETag/Cache-Control headers, plus Accept-Encoding, to save server bandwidth.
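
For what it's worth, here's a rough sketch of what that could look like on the scraper side, using only the Python standard library. The function names (build_headers, fetch) are made up for illustration, and it assumes the server actually returns an ETag and honours If-None-Match; if it doesn't, you just get the full body every time.

```python
import gzip
import urllib.error
import urllib.request


def build_headers(etag=None):
    """Request headers for a repeat scrape (hypothetical helper)."""
    # Ask for a compressed response to cut transfer size.
    headers = {"Accept-Encoding": "gzip"}
    # If we saw an ETag last time, make the request conditional.
    if etag:
        headers["If-None-Match"] = etag
    return headers


def fetch(url, etag=None):
    """Return (body, etag). body is None when the server answers 304."""
    req = urllib.request.Request(url, headers=build_headers(etag))
    try:
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
            # urllib does not decompress automatically, so do it here.
            if resp.headers.get("Content-Encoding") == "gzip":
                body = gzip.decompress(body)
            return body, resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        # urllib surfaces 304 Not Modified as an HTTPError.
        if err.code == 304:
            return None, etag
        raise
```

You'd store the returned ETag between runs and pass it back in on the next one; a None body means nothing changed and there's nothing new to commit.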

Collecting 1 kB every minute might not be a big deal, but collecting 1 MB every minute would cost an AWS-hosted service >$40/year in additional data transfer.
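
Back-of-envelope version of that estimate, in case anyone wants to plug in their own numbers. The $0.09/GB egress price is my assumption, not a quoted rate, and real pricing varies by region and tier.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600
PRICE_PER_GB = 0.09  # ASSUMED egress price in USD; check current pricing


def yearly_cost(mb_per_minute, price_per_gb=PRICE_PER_GB):
    """Estimated yearly transfer cost in USD for one scraper."""
    # Using decimal units: 1 GB = 1000 MB.
    gb_per_year = mb_per_minute * MINUTES_PER_YEAR / 1000
    return gb_per_year * price_per_gb


print(f"1 MB/min: ${yearly_cost(1):.2f}/year")      # 1 MB/min: $47.30/year
print(f"1 kB/min: ${yearly_cost(0.001):.2f}/year")  # 1 kB/min: $0.05/year
```

So 1 MB a minute is roughly 526 GB a year, which lands above $40 under that assumed price, while 1 kB a minute is pennies.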

It should definitely be optional. I can only imagine some busybody PM insisting they block harmless scrapes.