Hacker News
by setuid9002 975 days ago
Wait, so a simple web scraper script has to comply with robots.txt. But if I want to completely ignore the robots.txt, I only have to make my script more complicated (ChatGPT)?
4 comments

I'd like to consider this a difference between script action and user action.

For example, if you make a web page that a user pulls up and that page calls another webpage, is that a user action, a script action, or a mix of both? Personally, I would consider it a user action.

And make it complicated enough (a human) and no .txt can stop them!
Yeah, it's like one weird trick for web scraping.
Nobody actually has to comply with robots.txt.
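
To make that concrete, here is a rough sketch of what "complying" amounts to in practice: a check the client chooses to run before fetching. The robots.txt content, the `example-bot` user agent, the example.com URLs, and the `is_allowed` helper are all made up for illustration; only `urllib.robotparser` is the real standard-library module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed locally so nothing is fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite scraper runs this check and skips disallowed URLs.
# Nothing on the server side enforces the answer.
allowed_public = is_allowed("example-bot", "https://example.com/index.html")
allowed_private = is_allowed("example-bot", "https://example.com/private/data.html")
```

A scraper that never calls `is_allowed` can still request `/private/data.html`; whether it gets a response depends on the server, not on robots.txt.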