by cmroanirgo
1389 days ago
Although I heartily agree with the idea of a push model for search engines, I can't help but notice that it seems to hand even more centralisation to the search engines already out there. Here on HN we've been seeing posts about alternative search engines. How will those small bespoke engines make use of IndexNow unless the website notifies each of them?

The way I see IndexNow, I'll still get crawled relentlessly by the bots I don't want crawling my site (because robots.txt never seems to apply to them unless there's a special listing explicitly for them). So any engine that isn't a participating search engine will keep crawling the site for low-hanging fruit, and the problem isn't alleviated.

A good compromise would be something like an RSS feed, which a site can publish and crawlers can poll for changes. It would also make management easier for domains that have many moving parts: individual search engines can still be pinged, but each one then just grabs the changes.xml file... Or something.
There already is such an "RSS" feed: it's called a sitemap, usually available at /sitemap.xml. Alternatively, you can list the sitemap's URL in the robots.txt file.
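To make the "feed of changes" idea concrete, here is a rough sketch of how a small crawler could use a sitemap to fetch only what changed. It assumes the site fills in the optional `<lastmod>` element of the sitemap protocol; the function name `changed_since` and the example.com URLs are made up for illustration, and a real crawler would also have to handle sitemap index files and fetching over HTTP. In robots.txt, the sitemap is advertised with a line such as `Sitemap: https://example.com/sitemap.xml`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace used by <urlset> documents in the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def changed_since(sitemap_xml, since):
    """Return the <loc> URLs whose <lastmod> is newer than `since`.

    Hypothetical helper: URLs without a <lastmod> are returned too,
    because the crawler cannot tell whether they changed.
    """
    root = ET.fromstring(sitemap_xml)
    changed = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if not loc:
            continue
        if not lastmod:
            changed.append(loc)
            continue
        # Accept a trailing "Z" on Python versions older than 3.11.
        stamp = datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
        if stamp.tzinfo is None:
            # Assumption: date-only or naive values are treated as UTC.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp > since:
            changed.append(loc)
    return changed


# Made-up sitemap content for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://example.com/b</loc><lastmod>2024-01-01T08:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/c</loc></url>
</urlset>"""

last_crawl = datetime(2024, 1, 10, tzinfo=timezone.utc)
for page in changed_since(SAMPLE, last_crawl):
    print(page)
```

The catch is that `<lastmod>` is optional and only as trustworthy as the site that publishes it, so this reduces crawling only for sites that keep it accurate.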