by gitgud 2059 days ago

It's great if each of these processes can be invoked separately, so that after the HTML is saved, you don't need to redownload it, unless the source has changed.

By dividing scraping into rendering, caching, and parsing, you save yourself a lot of web requests. This also helps you avoid triggering the website's IP blocking, DDoS protection, and rate limiting.
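
To make the split concrete, here is a minimal sketch in Python using only the standard library. The function names (`download`, `cached_html`, `parse_title`), the on-disk cache layout, and the choice of extracting the page title are all assumptions for illustration, not anything from a particular scraper. A plain HTTP download stands in for the rendering step; a headless browser could be swapped in behind the same `fetch` argument.

```python
import hashlib
from html.parser import HTMLParser
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def download(url: str) -> str:
    """Fetch stage (hypothetical): the only function that touches the network."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def cached_html(url: str, cache_dir: Path,
                fetch: Callable[[str], str] = download) -> str:
    """Cache stage: return the saved HTML if present, else fetch and save it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumed layout: one file per URL, named by the SHA-256 of the URL.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetch(url)
    path.write_text(html, encoding="utf-8")
    return html


class _TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_title(html: str) -> str:
    """Parse stage: works on saved HTML only, so it can be rerun for free."""
    parser = _TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

With this shape, changing the parsing logic means rerunning `parse_title(cached_html(url, Path("cache")))` without a single new request. The sketch does not detect whether the source has changed; deleting the cached file is the simplest way to force a fresh download.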