by naasking 1604 days ago
Anyone have experience with Playwright compared to Selenium? I have a fairly large test suite and Selenium produces constant false positive errors, typically due to various timeouts that seem fundamentally unsolvable when running it from .NET. It's just very finicky.

I don't know if it's Selenium specifically or some problem with the .NET binding, but I figure Microsoft must have better .NET integration so it will at least eliminate that possible source of problems.

6 comments

I'm not sure if any of these are pertinent to your tests, but these are the issues I see most often that cause flaky tests:

- Hard-coded waits in your code, like "Thread.sleep(1000)". A better alternative is to replace hard-coded waits with something that waits for an element or value to appear on the page, e.g. click on a button and wait for a 'Success' message to appear. Puppeteer and Playwright both have good constructs for doing this.

- Needless complexity in the tests. Conditionals in particular are a code-smell and indicate there's something needlessly complex about the test.

- No test data management strategy. The more assumptions you can make about the state of your application, the simpler your tests become. Ideally tests are running in an environment that nothing else is touching, and you're seeding data into that environment before tests run. I personally don't believe in mocking data in regression tests since that quickly becomes hard to manage.
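
To illustrate the first point in the list above, here is a rough sketch of waiting on a condition instead of sleeping for a fixed time. `waitFor` and `makeFakePage` are hypothetical names made up for this example, not part of Selenium, Puppeteer, or Playwright, and the fake page only stands in for a real browser.

```javascript
// Hypothetical helper: polls `condition` until it returns a truthy value
// or `timeoutMs` elapses, then resolves with that value or rejects.
async function waitFor(condition, { timeoutMs = 5000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await condition();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in for a page that shows a 'Success' message some time after a click.
function makeFakePage(delayMs) {
  const page = { message: null };
  page.clickSubmit = () => {
    setTimeout(() => {
      page.message = "Success";
    }, delayMs);
  };
  return page;
}

async function demo() {
  const page = makeFakePage(120);
  page.clickSubmit();
  // Instead of sleeping for a fixed second, wait only as long as needed.
  return waitFor(() => page.message);
}

demo().then((message) => console.log(message));
```

The wait returns as soon as the message shows up, and a genuinely missing message fails with a clear timeout error instead of a confusing failure further down the test.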

We spend a lot of time thinking about these issues at my company and wrote a guide that covers other common regression testing issues in more detail here: https://reflect.run/regression-testing-guide/

> - Hard-coded waits in your code, like "Thread.sleep(1000)". A better alternative is to replace hard-coded waits with something that waits for an element or value to appear on the page.

We don't do any timed waits; all of our waits are for an element or value to appear, but sometimes these waits never complete, non-deterministically. We then added a long 5 min timeout on these waits because we know the test will never complete at that point. It's always fine in manual testing though, and also when we run the browser non-headless and watch it work. Very frustrating.

Sometimes the HTTP requests themselves time out after a few minutes, but this never happens in manual testing either. That's actually the most common issue these days, and this happens non-deterministically too. This is what I meant by "flaky".

I wonder if the infrastructure that's driving the browser tests is underpowered. If the browser process is dying silently or CPU is getting maxed out, it could manifest in what you're describing where it happens intermittently and there's very little to go on. I'm assuming you're running these tests in Chrome... You could check the Chrome debug logs to see if anything is being spit out there.

Does Selenium have a trace log generator that will dump out all events? I.e. all element creation on the page, matching, etc.

I'm not familiar with it specifically, but that's my go-to starting place in weird automation issues like that. Normally it gives some kind of hint as to why that's happening (or why Selenium thinks it's happening).

While I have not used Playwright (but have a lot of experience with Se), I would say the code style is refreshing:

    // Expect an element "to be visible".
    await expect(page.locator('text=Learn more').first()).toBeVisible();

Writing await for every action makes the timeout of the action seem more explicitly declared. There seems to be more granular control of timeouts as well: https://playwright.dev/docs/test-timeouts

> I don't know if it's Selenium specifically or some problem with the .NET binding

If the execution in .NET is slow then I suppose it could be .NET. But it could be (and often is) the suite design. You must wait for /everything/ before interacting with it because the code execution is quicker than the page.

Large Se/WebDriver suites are often a PIA. I find it's nice to write them with Python or Ruby so they can be debugged interactively with an interactive shell.

> If the execution in .NET is slow then I suppose it could be .NET. But it could be (and often is) the suite design. You must wait for /everything/ before interacting with it

That's what I do, but the wait for an element in certain tests times out after a few minutes, even though the elements are clearly visible, and manual use never has an issue.

From other comments it sounds like Puppeteer and Playwright are better on this, so will look into switching.

When I fixed many similar Selenium/WebDriver tests, the root cause was always the same: you grab a reference to an element and, for example, wait for it to become enabled or for some text to appear. But your UI framework actually replaces the element in the DOM while doing its thing, and your reference to the stale element will never change. The fix is to loop: search for the element with the selector and check whether it fulfills the conditions. If not, retry from the search again. We had nice helpers for those and had very stable Selenium tests.
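
A minimal sketch of that retry-from-search idea, under the assumption that a re-render swaps the element for a new object. `FakeDom` and `waitForElement` are hypothetical stand-ins invented for this example, not the helpers the comment refers to and not a real WebDriver API.

```javascript
// Fake DOM: the framework re-renders by replacing the element object.
class FakeDom {
  constructor() {
    this.nodes = { "#save": { enabled: false } };
  }
  querySelector(selector) {
    return this.nodes[selector] || null;
  }
  rerender(selector, props) {
    // A new object is stored, so any old reference goes stale.
    this.nodes[selector] = { ...props };
  }
}

// Re-run the lookup on every attempt, so a replaced element is picked up.
async function waitForElement(find, predicate, { timeoutMs = 2000, intervalMs = 20 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const element = find(); // fresh search each time, never a cached reference
    if (element && predicate(element)) return element;
    if (Date.now() >= deadline) {
      throw new Error("Timed out waiting for element");
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

async function staleReferenceDemo() {
  const dom = new FakeDom();
  const stale = dom.querySelector("#save"); // grabbed before the re-render
  setTimeout(() => dom.rerender("#save", { enabled: true }), 50);
  const fresh = await waitForElement(
    () => dom.querySelector("#save"),
    (el) => el.enabled
  );
  // The stale reference still reports the old state; the fresh one does not.
  return { staleEnabled: stale.enabled, freshEnabled: fresh.enabled };
}
```

Waiting on `stale.enabled` would never succeed here, which matches the "wait never completes even though the element is clearly visible" symptom.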

> Fix is to loop searching the element with selector and check if the element fills the conditions. If not, retry from search again. We had nice helpers for those and had very stable selenium tests.

Thanks, I'll double check, but I think we do this now. In looking at the history of test failures, those failures are indeed less common, but still plenty of false positives of other types. Most persistent recent failures are the WebDriver timing out when loading a URL, which has never happened while manual testing or when being used by end users, so not sure what's going on there.

In any case, if the Playwright API encourages better idioms for writing tests that avoid these pitfalls, that would be cool, because I deal with a lot of work-term students who aren't adept at this kind of stuff, so that would save a lot of headaches.

In my experience another common issue is a race condition in the trigger, i.e. the system is not quite settled yet when an interaction is performed, so the interaction does not register.

This is more likely when the system is loaded / shared, e.g. a CI VM.

It’s instructive to watch a screencast / recording of UI tests, because you don’t necessarily intuit how jittery and fast the harness will perform its interactions.

I have had similar issues with selenium via other languages too - it is generally pretty flaky. E.g. saying a button or some other element doesn't exist when it clearly does.

With great care and effort you can make your tests reliable (especially if you are happy to allow a "best of 3" type test strategy to allow for 1 flake and 2 passes) though. Prodigious use of the wait (i.e. stdlib polling) primitives seems to give you the most bang for your buck.
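
For what it's worth, a "best of 3" strategy could look roughly like this. `bestOf` is a hypothetical wrapper written for this example, not a feature of Selenium or of any particular test runner, and the `attempts` / `needed` defaults are assumptions taken from the "1 flake and 2 passes" wording above.

```javascript
// Hypothetical retry wrapper: runs `testFn` until it has passed `needed`
// times, or until `needed` passes can no longer be reached within `attempts`.
async function bestOf(testFn, { attempts = 3, needed = 2 } = {}) {
  let passes = 0;
  const failures = [];
  for (let run = 1; run <= attempts; run++) {
    try {
      await testFn(run);
      passes++;
    } catch (error) {
      failures.push(error);
    }
    if (passes >= needed) {
      return { passed: true, passes, failures };
    }
    const remaining = attempts - run;
    if (passes + remaining < needed) break; // cannot reach `needed` anymore
  }
  return { passed: false, passes, failures };
}

// Example: a test that flakes on its first run only.
async function flakyOnce(run) {
  if (run === 1) throw new Error("element not found");
}
```

Keeping the collected `failures` around is deliberate: a test that needs its retry every night is still worth investigating.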

I am not sure if this is just the nature of web automation, or if Selenium is just crap? My gut is to say it is Selenium's fault, since we never get the same issues when using JavaScript in the DOM or in an extension... maybe it is the browser APIs, I guess? I have no idea, but if this Playwright is any better then that would be superb.

I think the issue here is that both Selenium and Playwright depend on selectors, which depend on the UI, so when the UI changes the tests break. We are working on something that generates adapting code for you (Cypress first) and lets you know when your test script needs to change.

We have an AI model that understands the page structure as humans do. So we can do this "Click on 'Sign in' on the 'Login' page".

We have a no-code tool as well which adapts to the changes. But we want to generate the code for people who want to keep things internal.

Would love to discuss it more: m@preflight.com Our website: https://preflight.com

On paper Playwright should be a LOT better - it's taken a similar approach to Cypress, where everything is designed around the need to reduce flaky tests.

In Playwright that manifests itself as the "auto-wait" feature: https://playwright.dev/docs/actionability

You can do this kind of thing with Selenium too but it wasn't designed in from the very start of that project.

I think it's possible to write tests in Selenium which are time-independent... E.g. "Wait for element #foo to exist".

You can also give the browser a virtual clock so that you can use time-based timeouts and give every test a timeout of 3 hours, but those 3 hours only take milliseconds in real time. That approach gets CPU-expensive if your site has any background polling scripts or animation, because obviously the animation will end up animating a lot during the test!
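
Here is a toy version of the virtual clock idea, just to show the mechanics. `VirtualClock` is a hypothetical class written for this example; it is not how any browser or Selenium implements it, and the one-minute "animation" interval is an arbitrary assumption to keep the numbers small.

```javascript
// Minimal virtual clock sketch: time only moves when `advance` is called.
class VirtualClock {
  constructor() {
    this.now = 0;
    this.timers = [];
    this.nextId = 1;
  }
  setTimeout(callback, delayMs) {
    const id = this.nextId++;
    this.timers.push({ id, at: this.now + delayMs, callback });
    return id;
  }
  clearTimeout(id) {
    this.timers = this.timers.filter((t) => t.id !== id);
  }
  // Jump forward, firing due timers in time order. Returns how many fired.
  advance(ms) {
    const target = this.now + ms;
    let fired = 0;
    for (;;) {
      const due = this.timers
        .filter((t) => t.at <= target)
        .sort((a, b) => a.at - b.at || a.id - b.id)[0];
      if (!due) break;
      this.timers = this.timers.filter((t) => t.id !== due.id);
      this.now = due.at;
      due.callback();
      fired++;
    }
    this.now = target;
    return fired;
  }
}

const THREE_HOURS_MS = 3 * 60 * 60 * 1000;

function virtualClockDemo() {
  const clock = new VirtualClock();
  let timedOut = false;
  let frames = 0;
  clock.setTimeout(() => {
    timedOut = true;
  }, THREE_HOURS_MS);
  // A background "animation" that re-arms itself every virtual minute.
  const tick = () => {
    frames++;
    clock.setTimeout(tick, 60000);
  };
  clock.setTimeout(tick, 60000);
  const fired = clock.advance(THREE_HOURS_MS);
  return { timedOut, frames, fired };
}
```

In the demo the three-hour timeout fires without any real waiting, but the background tick also runs 180 times on the way there. With a real per-frame animation the callback count would be far higher, which is the CPU cost mentioned above.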

> I think it's possible to write tests in Selenium which are time-independent... E.g. "Wait for element #foo to exist".

Yes, this is what I've done but the elements non-deterministically do or do not appear according to selenium, and then the wait times out. This happens for dozens of tests across dozens of pages with no issue with manual use, so something fishy is going on.

I tried Selenium and then Playwright for a .NET project. Selenium wasn't easy to work with. Playwright was good, but for some reason which I don't recall exactly (it could have been because it had to re-download Chromium every time we deployed) I ended up switching to Puppeteer, and I was very happy with it.

You can now generate Puppeteer code from the Google Chrome Recorder. You should check it out. But it still might be flaky.

No-code is the best in my opinion :D https://preflight.com

We have done all the groundwork, like:

- Concurrency
- Adapting to changes. Our selectors are like this: "Click on 'Login' button in the 'Sign in' form"
- Updating the tests with an HTML/video player, etc.