by a1sabau 1967 days ago
https://github.com/get-set-fetch/scraper - I've been working (intermittently :) ) on a Node.js or browser-extension scraper for the last 3 years; see the other projects under the get-set-fetch umbrella. I've been putting in a lot more effort lately, as I really want to run analyses of the Alexa top 1 million sites: top JS libraries, certificate authorities and so on. I posted it on Show HN a few weeks back, since you can already do basic (maybe intermediate?) scraping with it.

It's not yet capable of handling 1 million+ pages, as it's still limited to Puppeteer or Playwright, both of which drive a full browser. I'm working on adding cheerio/jsdom support right now.