| HN Mirror


by wraptile 1490 days ago

It's a sad reality that web scraping is quickly becoming accessible only to people who can afford the captcha/computing/dev resources.

Identifying scrapers is actually really easy, but it's not a binary decision. Anti-scraping systems usually keep a trust score compiled from a few measurements, so just applying some commonly known patches can improve your score significantly!
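To make the scoring idea concrete, here's a toy sketch of how such a score might be put together. The signal names and point values are made up for illustration; real anti-scraping systems don't publish theirs and almost certainly use more signals than this.

```python
# Hypothetical signals and point values, invented for illustration only.
# Each signal a client passes adds its points to the trust score (max 100).
SIGNAL_POINTS = {
    "browser_like_headers": 30,
    "residential_ip": 30,
    "browser_tls_fingerprint": 20,
    "passes_js_checks": 20,
}


def trust_score(signals: dict[str, bool]) -> int:
    """Sum the points of every signal the client passes; missing signals count as failed."""
    return sum(
        points
        for name, points in SIGNAL_POINTS.items()
        if signals.get(name, False)
    )


# A bare script passes nothing; patching headers and TLS already gets halfway.
bare_script = {}
patched = {"browser_like_headers": True, "browser_tls_fingerprint": True}

print(trust_score(bare_script))  # 0
print(trust_score(patched))      # 50
```

The point is that no single check decides the outcome: every patch moves the score a bit, and the site blocks or challenges you based on where the total lands.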

We recently published a blog series on everything that can be done to avoid blocking [1]: request headers, proxies, TLS fingerprint, JS fingerprint, etc. It's quite a bit of work to get there if you're new to web scraping, though - there's just so much information to get through and it's growing every day.
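As a small example of the first item, request headers, here's a sketch using only Python's standard library. The header values are assumptions picked to look like a desktop Chrome browser, and `build_request` is a hypothetical helper; it isn't taken from the blog series. Headers alone won't fix proxies or TLS/JS fingerprints.

```python
import urllib.request

# Assumed browser-like header values, for illustration only.
# The default "Python-urllib/x.y" User-Agent is an obvious scraper giveaway.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


req = build_request("https://example.com/")
# To actually send it: urllib.request.urlopen(req, timeout=10)
```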

1 - https://scrapfly.io/blog/how-to-scrape-without-getting-block...