Hacker News
by billconan 2115 days ago
same here.

They use two approaches that make developing bots very difficult.

1. They detect device input. If there is no mouse movement while the page is loading, they treat the visitor as a bot.

2. They detect the order of page visits. A human visitor does not enumerate all paths; they follow certain patterns. The difference is detectable with their machine learning model.

I really don't have a solution for #2.
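
To make #2 a bit more concrete, here is a toy sketch of one signal a site could compute from visit order: the share of page transitions that follow a link actually present on the previous page. This is only an illustration, not their model. The function name, the site graph, and both sessions are made up, and a real system would presumably combine many more features.

```python
def link_following_ratio(visits, links):
    """Return the fraction of transitions that follow a link on the previous page.

    visits: ordered list of paths requested in one session.
    links:  dict mapping each path to the set of paths it links to.
    """
    transitions = list(zip(visits, visits[1:]))
    if not transitions:
        return 0.0
    followed = sum(1 for src, dst in transitions if dst in links.get(src, set()))
    return followed / len(transitions)


# Hypothetical site graph (assumption for illustration only).
SITE_LINKS = {
    "/": {"/products", "/about"},
    "/products": {"/products/1", "/products/2", "/"},
    "/products/1": {"/products", "/products/2"},
    "/products/2": {"/products", "/products/1"},
    "/about": {"/"},
}

# A session that clicks through the site the way a person might.
human_session = ["/", "/products", "/products/1", "/products", "/products/2"]

# A session that walks a list of known paths without following links.
bot_session = ["/about", "/products/1", "/", "/products/2", "/about"]

human_score = link_following_ratio(human_session, SITE_LINKS)
bot_score = link_following_ratio(bot_session, SITE_LINKS)
print(human_score, bot_score)
```

In this made-up example every human transition follows a link and none of the bot's do, which is the kind of gap a classifier could pick up on.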

1 comment

I think the solution is "hybrid" scraping with a human driving the clicks and the scraper passively collecting the data.
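
One low-tech way to try this, as a sketch: browse the site by hand with the browser's network panel open, export the session as a HAR file, and let a script pull the data out of the capture afterwards. HAR is a JSON format, so the standard library is enough. The function name, the MIME filter, and the sample capture below are my own placeholders, and whether response bodies end up in the export depends on the browser and its settings.

```python
import base64


def extract_responses(har, mime_prefix="application/json"):
    """Yield (url, body_text) for captured responses whose MIME type matches."""
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        if not content.get("mimeType", "").startswith(mime_prefix):
            continue
        text = content.get("text")
        if text is None:
            continue  # the body was not saved in this capture
        if content.get("encoding") == "base64":
            # HAR marks base64-encoded bodies with an "encoding" field.
            text = base64.b64decode(text).decode("utf-8", errors="replace")
        yield entry["request"]["url"], text


# Tiny hand-written capture standing in for a real export (hypothetical URLs).
SAMPLE_HAR = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/api/items"},
                "response": {
                    "content": {
                        "mimeType": "application/json; charset=utf-8",
                        "text": '{"items": [1, 2]}',
                    }
                },
            },
            {
                "request": {"url": "https://example.com/logo.png"},
                "response": {
                    "content": {
                        "mimeType": "image/png",
                        "text": "iVBORw0KGgo=",
                        "encoding": "base64",
                    }
                },
            },
            {
                "request": {"url": "https://example.com/api/more"},
                "response": {
                    "content": {
                        "mimeType": "application/json",
                        "text": base64.b64encode(b'{"ok": true}').decode("ascii"),
                        "encoding": "base64",
                    }
                },
            },
        ]
    }
}

results = list(extract_responses(SAMPLE_HAR))
for url, body in results:
    print(url, body)
```

For a real session you would load the exported file with `json.load` and pass the resulting dict to `extract_responses` in place of `SAMPLE_HAR`.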

If you record those sessions, you can probably teach an AI to emulate them.

I love this. I might try it. It doesn't scale, but that's okay for my project.