by turtlebits 2472 days ago
Very cool, but super slow, especially for an API endpoint, which I would expect to be usable directly from a front end.

Tested on a site I regularly visit

  dashblock (3 selectors, ~20 items):  16.911 seconds
  curl (no scraping):  60 ms
  chrome:  987 ms

edit: added chrome

Indeed, we render the whole page, including its JavaScript, which is why it takes longer than curl. For now, it's especially useful for dynamic pages, but we also plan to support pages that don't require rendering.
Maybe you already do it, but I think integrating adblocker functionality when loading JS sites would be desirable to reduce load time. And if ads are what the API user is interested in, perhaps add a flag for whether or not one wants ads to load.

Recommendation: https://github.com/cliqz-oss/adblocker, which should be the fastest adblocker library (used by Ghostery, Cliqz and Brave).
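
To make the "flag for ads" idea concrete, here is a rough sketch of the decision a renderer could make for each outgoing request. It is not the cliqz-oss/adblocker API: `should_block`, `load_ads`, and the `BLOCKED_HOSTS` set are all made-up names, and a real blocker would match against proper filter lists rather than a handful of hostnames.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist for illustration; real blockers use full filter lists.
BLOCKED_HOSTS = {"ads.example.com", "tracker.example.net"}


def should_block(url, load_ads=False, blocked_hosts=BLOCKED_HOSTS):
    """Return True if a request to `url` should be aborted while rendering."""
    if load_ads:
        # The API user explicitly asked for ads, so let everything through.
        return False
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Match the host itself or any parent domain, e.g. cdn.ads.example.com.
    return any(".".join(parts[i:]) in blocked_hosts for i in range(len(parts)))
```

A headless browser's request-interception hook could call this for every request and abort the ones that return True.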

Thanks for the advice, it makes a lot of sense!
Sounds good. It would make sense to test, at publish time, whether your selectors match the raw HTML, to determine whether JS rendering is required at all.
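
Something like the following is what I have in mind, as a minimal sketch using only the Python standard library. The assumptions are mine: `needs_js` is a made-up name, and it only understands the simplest selectors (`tag`, `.class`, `#id`), so a real check would need a full CSS selector engine.

```python
from html.parser import HTMLParser


class _SelectorMatcher(HTMLParser):
    """Records which simple selectors match at least one start tag."""

    def __init__(self, selectors):
        super().__init__()
        self.matched = {s: False for s in selectors}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        for sel in self.matched:
            if sel.startswith("#"):
                hit = attr.get("id") == sel[1:]
            elif sel.startswith("."):
                hit = sel[1:] in classes
            else:
                hit = tag == sel.lower()
            self.matched[sel] = self.matched[sel] or hit


def needs_js(raw_html, selectors):
    """Return True if any selector finds nothing in the un-rendered HTML."""
    matcher = _SelectorMatcher(selectors)
    matcher.feed(raw_html)
    matcher.close()
    return not all(matcher.matched.values())
```

If `needs_js` returns False for the HTML fetched with a plain HTTP request, the page could be served without rendering; otherwise fall back to the full browser.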
Yep, that's our plan :)