| HN Mirror


by dredmorbius 1087 days ago

The other option would be to front caches through ISPs and the like.

This works far better when requests are concentrated, that is, when there's a large number of requests against a small set of origin resources. When dealing with widespread and deep scraping, other strategies might be necessary, but these aren't impossible to envision.
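To make the difference concrete, here's a rough sketch of a fronting cache under the two access patterns. Everything in it is an assumption for illustration: the `LRUCache` class, the `origin_fetch` stand-in, the `/item/N` paths, and the capacity of 100 are invented, and a real ISP-level cache would also have to deal with expiry, validation, and much larger objects.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch):
        if key in self.store:
            # Served from cache: mark the entry as most recently used.
            self.store.move_to_end(key)
            self.hits += 1
            return self.store[key]
        # Cache miss: go to the origin, then store the result.
        self.misses += 1
        value = fetch(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            # Evict the least recently used entry.
            self.store.popitem(last=False)
        return value


def origin_fetch(key):
    # Hypothetical stand-in for a request to the origin server.
    return f"body of {key}"


def hit_rate(requests, capacity=100):
    """Fraction of requests answered without touching the origin."""
    cache = LRUCache(capacity)
    for key in requests:
        cache.get(key, origin_fetch)
    return cache.hits / (cache.hits + cache.misses)


# Many requests against a small set of resources (10 distinct paths).
popular = [f"/item/{i % 10}" for i in range(10_000)]
# A deep scrape: every request touches a different resource.
deep_scrape = [f"/item/{i}" for i in range(10_000)]

print(hit_rate(popular))
print(hit_rate(deep_scrape))
```

In the first case only the first request for each of the 10 paths reaches the origin; in the second, no request is ever repeated, so the cache absorbs nothing and the origin sees the full load.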

Specifically permitted scraping interfaces or APIs for large-volume data access would be another option.

Of course, there's the associated issue that data aggregation itself conveys insights and power. Those who think they're providing incidental and low-volume access to records might be concerned to discover that there's a wholesale trade occurring in the background (whether that's remunerated or free of charge).