by selcuka
272 days ago
No, they explicitly block Gemini as well:

  User-agent: Google-Extended
  Disallow: /

Gemini still uses the same user agent, but it has a different robots.txt entry (Google-Extended) [1]:

> Google-Extended is a standalone product token that web publishers can use to manage whether content Google crawls from their sites may be used for training future generations of Gemini models that power Gemini Apps and Vertex AI API for Gemini and for grounding (providing content from the Google Search index to the model at prompt time to improve factuality and relevancy) in Gemini Apps and Grounding with Google Search on Vertex AI.

[1] https://developers.google.com/search/docs/crawling-indexing/...
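For anyone who wants to see how a standard robots.txt parser reads that entry, here is a minimal sketch using Python's built-in urllib.robotparser. It only illustrates token matching, not how Google applies the rule internally; the example.com URL is a placeholder, and the robots.txt body contains nothing but the two lines quoted above, which is an assumption about the rest of the file.

```python
from urllib.robotparser import RobotFileParser

# Only the two lines quoted in the comment; a real robots.txt
# would normally contain other entries as well (assumption).
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/some/page"  # hypothetical URL

# The Google-Extended token matches the entry, so everything is disallowed.
blocked = not parser.can_fetch("Google-Extended", url)

# No entry applies to Googlebot in this file, so it falls through to "allowed".
search_ok = parser.can_fetch("Googlebot", url)

print("Google-Extended blocked:", blocked)
print("Googlebot allowed:", search_ok)
```

The point is that the two tokens are matched independently: disallowing Google-Extended does not, by itself, say anything about Googlebot.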
I imagine many of the orgs that are blocking "training" don't understand the difference between training and inference-time, tool-based context extension (which really needs an agreed-upon name; it's hard to talk about right now).