Hacker News new | ask | show | jobs
by jeroenhd 1452 days ago
You're right, it shouldn't. It's possible that they're fetching these URLs from their customers' browsing history and submitting those (external submissions follow different crawling rules, sometimes bypassing robots.txt). Bing's webmaster information says so, at least: https://www.bing.com/webmasters/help/webmasters-guidelines-3...

For a bit of added "fun", Google will do the same, but if you add a page to robots.txt and set noindex then they won't process the noindex parameter and external indexing sources might still generates search results: https://developers.google.com/search/docs/advanced/crawling/...