Hacker News
by devsda 36 days ago
How do you distinguish Google/MS scraping for Gemini/Copilot from scraping for Google Search/Bing? In Google's case the UA is the same, so you are entirely at their mercy to honor the Google-Extended rules in robots.txt.
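
For anyone unfamiliar with how that opt-out is expressed, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt content and the example.com URL are made up for illustration. It shows that Google-Extended only exists as a robots.txt token: a parser can tell the two apart by name, but the requests hitting your server carry the regular Google crawler UA, so nothing on your side can enforce the rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of AI training use, keep normal search crawling.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(agent, url="https://example.com/post/1"):
    """Return True if robots.txt permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The distinction exists only in the robots.txt token name;
    # compliance is entirely up to the crawler operator.
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, "allowed" if allowed(agent) else "disallowed")
```

Since the HTTP request looks identical either way, filtering on UA in your server logs or firewall cannot separate the two uses.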

Google has further complicated this with its new search announcement, which blurs the line between regular search and AI search. And AI companies tend not to honor licenses or instructions when they are hungry for training material.

It is once again an example of Google abusing its dominant position to promote its cross-functional products.

2 comments

If companies like Meta are downloading pirated books etc. to train their AI, they will surely honor robots.txt.
I wouldn't be surprised if there is some sort of legal action against Google, the monopoly, to force a distinction in how its crawlers use scraped content.