Hacker News
by mrgalaxy 1456 days ago
I think I slightly misunderstood the point of this site; people probably do want fast updates. But I still think the price should be brought down. You could probably save costs by pooling the crawling, e.g. if multiple subscribers request the same page, do that crawl once for all of them. You may already be doing that.
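To make the pooling idea concrete, here is a minimal sketch in Python. It is only an illustration, not the site's actual code: `pooled_crawl`, the `(subscriber, url)` pair format, and the `fetch` callable are all hypothetical names made up for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple


def pooled_crawl(
    subscriptions: Iterable[Tuple[str, str]],
    fetch: Callable[[str], str],
) -> Dict[str, Dict[str, str]]:
    """Fetch each distinct URL once and share the result with every subscriber.

    `subscriptions` is an iterable of (subscriber, url) pairs (an assumed format).
    `fetch` is any callable that downloads a URL and returns its body.
    """
    # Group subscribers by the page they asked for.
    subscribers_by_url = defaultdict(list)
    for subscriber, url in subscriptions:
        subscribers_by_url[url].append(subscriber)

    results: Dict[str, Dict[str, str]] = {}
    for url, subscribers in subscribers_by_url.items():
        body = fetch(url)  # one download, however many subscribers want it
        for subscriber in subscribers:
            results.setdefault(subscriber, {})[url] = body
    return results
```

With this shape the number of downloads is the number of distinct URLs, not the number of subscriptions, which is where the cost saving would come from.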
1 comment

RE: pooling - yup, pooling for the win! As a self-taught programmer, this was one block of code I was particularly proud of :)
Crawling too fast from the same IP address can get that IP blocked. Once per 10 minutes is probably OK; once per minute may be too much, depending on the volume being downloaded and how sensitive the content is (you mention saleable stuff in another comment, so this might well make a difference). FYI, I've done a reasonable amount of scraping myself.
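One common way to avoid hammering a site is to enforce a minimum gap between requests to the same host. A rough sketch in Python, assuming a single crawler process; `HostThrottle` and its parameters are hypothetical names, and the 600-second figure is just the "per 10 mins" guess above, not a known safe limit for any site:

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_request = {}  # host -> time of the most recent request

    def wait(self, url):
        """Block until `url`'s host may be hit again; return seconds slept."""
        host = urlsplit(url).hostname or ""
        waited = 0.0
        last = self._last_request.get(host)
        if last is not None:
            remaining = self.min_interval_s - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_request[host] = self._clock()
        return waited


# Usage: call throttle.wait(url) right before each download.
throttle = HostThrottle(min_interval_s=600)  # 10 minutes, an assumed value
```

The throttle is keyed by target host rather than applied globally, so a slow schedule for one site does not hold up crawls of other sites.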

(err, crawling = scraping? Or maybe they're different things)

Edit: tidy + clarify