Hacker News
Show HN: Which is faster? Puppeteer, Playwright or Selenium (colab.research.google.com)
6 points by eneuman 1016 days ago
Hey everyone, I just ran a [rather silly] race between Puppeteer (JS), Playwright (Python) and Selenium (Python) to see which one would be fastest on a simple scrape. It runs on Google Colab, so you can run it yourself.

Far from a comprehensive benchmark, this race is 100% free from advanced configurations, multi-threading or anything complicated. It just opens Wallapop (a second-hand marketplace in Spain) and times how long it takes to extract the first 2000 results of a search.
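
For anyone who wants to reproduce the timing part, here is a minimal sketch of a harness in Python. It is not the notebook's actual code: `time_runs`, `scrape` and `repeats` are names made up for illustration, and repeating the run and taking the median is an assumption added here to smooth out network noise.

```python
import time
from statistics import median


def time_runs(scrape, repeats=3):
    """Call scrape() `repeats` times and return (median_seconds, durations).

    `scrape` is any zero-argument callable, for example a function that
    launches a browser, collects the results and closes the browser again.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        scrape()
        durations.append(time.perf_counter() - start)
    return median(durations), durations
```

Usage would look like `median_s, runs = time_runs(run_playwright_scrape)`, where `run_playwright_scrape` is a hypothetical function wrapping one tool's full scrape.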

If you like this simple format, have any ideas on how to improve a race like this, or have a strong urge to prove Ward Cunningham right, let me know in the comments!

2 comments

    > Puppeteer (JS), Playwright (Python) and Selenium (Python)
Lots of questionable design choices here.

    - Why not all JS or all Python?
    - Why is the Playwright code scrolling?
    - Why is the Playwright code using explicit timeouts?
    - Why is the Playwright code using `evaluate` rather than `locators` and `click`?
- Language Choice (JS vs Python): Puppeteer in JS and Playwright in Python showed near-identical performance on an AWS c5.large instance, so testing both in the same language did not seem necessary for this comparison.

- Playwright Scrolling: To emulate a real user, all three tools used infinite scrolling. This was necessary because Wallapop has no pagination: you have to scroll to load more results.

- Explicit Timeouts: I used these for greater stability, especially under inconsistent network conditions. Initially, I triggered scrolls from API response events, but that approach was less reliable.

- Evaluate vs. Locators & Click: My initial tests indicated evaluate was marginally faster than locators and click.

I appreciate the scrutiny and I might include a JS vs Python comparison in a future test.
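
For reference, the two extraction styles being compared look roughly like this with Playwright's Python sync API. This is a sketch, not the notebook's code: `ITEM_SELECTOR` is a hypothetical selector (Wallapop's real markup is not shown in this thread), and both functions expect an already opened `page`.

```python
# Hypothetical selector; the real page markup is an assumption here.
ITEM_SELECTOR = "a.item-card"


def titles_via_evaluate(page, selector=ITEM_SELECTOR):
    """Run one JS function inside the page and get plain strings back."""
    return page.evaluate(
        "sel => Array.from(document.querySelectorAll(sel),"
        " el => el.textContent.trim())",
        selector,
    )


def titles_via_locator(page, selector=ITEM_SELECTOR):
    """Let the Locator API resolve the selector and collect the text."""
    return [text.strip() for text in page.locator(selector).all_text_contents()]
```

Both return a list of strings, so they can be swapped inside the same timing loop to check whether the speed difference holds on a given page.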

I think the approach misses the point. Playwright's auto-waiting with `locators` is what makes it worth adopting, because it means you don't need fixed waits. Auto-waiting removes much of the idle time that fixed waits spend doing nothing.
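
A rough sketch of the difference, using Playwright's Python sync API. Strictly speaking this uses an explicit condition wait (`locator.wait_for`) rather than the auto-waiting built into actions like `click`, but the idea is the same: continue as soon as the page is ready instead of sleeping for a fixed time. The function names, the scroll distance and the defaults are made up for illustration, and the second function assumes loaded items stay in the DOM.

```python
def scroll_with_fixed_wait(page, selector, target, pause_ms=1000, max_scrolls=200):
    """Scroll, then always sleep pause_ms, even if new items loaded sooner."""
    items = page.locator(selector)
    for _ in range(max_scrolls):
        if items.count() >= target:
            break
        page.mouse.wheel(0, 10_000)
        page.wait_for_timeout(pause_ms)  # fixed wait: pure idle time
    return items.count()


def scroll_with_condition_wait(page, selector, target, timeout_ms=10_000):
    """Scroll, then continue as soon as at least one new item is attached."""
    items = page.locator(selector)
    while (seen := items.count()) < target:
        page.mouse.wheel(0, 10_000)
        # Waits only until item number `seen` (zero-based) exists in the DOM.
        # In Playwright this raises a timeout error if nothing new arrives
        # within timeout_ms, which also ends the loop when results run out.
        items.nth(seen).wait_for(state="attached", timeout=timeout_ms)
    return items.count()
```

With the fixed version, every scroll costs the full `pause_ms`; with the condition version, each scroll costs only as long as the next batch actually takes to appear.
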
Man, I love Puppeteer