by sunshadow 956 days ago
You don't use XPath or CSS selectors at all (unless you have no choice). You rely on more generic, user-facing attributes, e.g., "the button that has 'Sign in' on it":

    await page.getByRole('button', { name: 'Sign in' }).click();
See the Playwright locators docs: https://playwright.dev/docs/locators

I started putting data-testid attributes in my web app for automated testing with Playwright. It keeps me from breaking my own scripts, but it sure would make me more scrapable if anyone cared. Well... I guess I only do it on inputs, not on the rendered page, which is what scrapers care most about.
Unless you start a war against scrapers, you don't need to worry about that, as I'll always find a way to scrape your site as long as it's valuable to 'me'. Even if it requires a real browser + OCR :)
Oh, I know I couldn't prevent it. But if you wanted to scrape me, you'd have to pay the monthly subscription, because everything is behind a paywall/login. And then you'd only have access to data you entered yourself, because it's just that kind of app :-)
This is where you just train an LLM so you can write:

'get button named "sign in" and click'

Then on the back end, it generates your example code.

Adept is doing it.
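
To make the "plain English in, Playwright code out" idea concrete, here is a rough sketch of the translation step. It is only a rule-based stand-in, not an LLM and not how Adept works: the function name `toPlaywright` and the tiny command grammar are made up for illustration. The one real thing it targets is the `page.getByRole(...)` locator call from the comment above.

```typescript
// Hypothetical stand-in for the LLM step. It only understands commands
// shaped like:  get <role> named "<name>" and <click|hover|focus>
// A real model would handle far looser phrasing; this regex does not.
function toPlaywright(command: string): string {
  const match = command
    .trim()
    .match(/^get (\w+) named "([^"]+)" and (click|hover|focus)$/i);
  if (!match) {
    throw new Error(`Unsupported command: ${command}`);
  }
  const [, role, name, action] = match;
  // JSON.stringify quotes and escapes the role and name for the generated source.
  const roleLiteral = JSON.stringify(role.toLowerCase());
  const nameLiteral = JSON.stringify(name);
  return `await page.getByRole(${roleLiteral}, { name: ${nameLiteral} }).${action.toLowerCase()}();`;
}

// Example: turn the sentence from the comment into a line of Playwright code.
console.log(toPlaywright('get button named "sign in" and click'));
```

The lowercase "sign in" should still find a 'Sign in' button, since getByRole's name matching is case-insensitive by default. The generated string would still have to be run inside a real Playwright test to do anything.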