by chii 542 days ago
> If you want people to not scrape, offer API’s

Many sites want to block scrapers because they don't want their information aggregated: price lists, product availability, and so on.

I know grocery sites do this to keep customers from seeing the price histories of products. They want to raise prices, then offer a discount so that the discount looks legitimate.


On the topic of scraping grocery sites, here's an example of bypassing bot-detection on Albertsons: https://github.com/seleniumbase/SeleniumBase/blob/master/exa... (A demo of that is in https://www.youtube.com/watch?v=Mr90iQmNsKM)
It seems weird to me that that works. When I scroll elements into view (or do similar things) in other code, I randomize the scroll speed to simulate human behavior, but SeleniumBase evidently doesn't.

Maybe I am just too paranoid.
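
For anyone wondering what "a random scroll speed" might look like in practice, here is a minimal sketch. It is not the commenter's code and not anything from SeleniumBase; the function name `human_scroll_plan`, the step range, and the pause range are all made-up values for illustration. It assumes a non-negative downward scroll distance in pixels and only produces the plan; it does not drive a browser.

```python
import random

def human_scroll_plan(distance_px, rng=None, min_step=40, max_step=160):
    """Split one scroll into random-sized steps with random pauses.

    Returns a list of (step_px, pause_seconds) tuples whose steps
    sum to distance_px. Hypothetical helper: the ranges are guesses,
    not values measured from real users.
    """
    rng = rng or random.Random()
    plan = []
    remaining = distance_px
    while remaining > 0:
        # Never overshoot the target on the final step.
        step = min(remaining, rng.randint(min_step, max_step))
        pause = rng.uniform(0.01, 0.08)  # jitter between steps
        plan.append((step, pause))
        remaining -= step
    return plan
```

Each step could then be handed to whatever scroll call your driver exposes (for example running `window.scrollBy(0, step)` in the page), sleeping for `pause` seconds in between.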

SeleniumBase CDP Mode uses `DOM.scrollIntoViewIfNeeded` (https://chromedevtools.github.io/devtools-protocol/tot/DOM/#...), so it only scrolls when elements are offscreen, rather than always scrolling. This reduces the number of scrolls needed. Also, it seems that most anti-bot services are not looking at scrolling as a way of identifying users.
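
To make that concrete, here is a rough sketch of the raw protocol message for that method, plus a toy model of the "only scroll if offscreen" idea. This is not SeleniumBase's internal code. The function names, the message id, and the node id are hypothetical, and `is_offscreen` is a deliberate simplification (it only treats elements lying entirely outside the viewport as offscreen), not Chrome's actual visibility logic.

```python
import json

def scroll_into_view_if_needed_message(message_id, node_id):
    """Build the JSON text of a CDP DOM.scrollIntoViewIfNeeded command.

    node_id is whatever id an earlier DOM query returned; the value
    used by a caller here is purely illustrative.
    """
    return json.dumps({
        "id": message_id,
        "method": "DOM.scrollIntoViewIfNeeded",
        "params": {"nodeId": node_id},
    })

def is_offscreen(top, bottom, viewport_height):
    """Toy check: True if the element lies entirely outside the viewport.

    top and bottom are pixel offsets relative to the viewport's top edge.
    """
    return bottom <= 0 or top >= viewport_height
```

With this model, an element at offsets 100 to 200 in an 800px viewport needs no scroll at all, which is why a run over mostly visible elements ends up scrolling far less than code that scrolls unconditionally.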