HN Mirror



	by motherwell 6232 days ago
	Just for clarification: 1. robots.txt excludes CRAWLING, i.e. downloading, but NOT indexing, i.e. including a URL / site in a database of known URLs / sites. 2. The robots meta tag disallows INDEXING but NOT crawling. So it is semantically correct to index a site / URL that is disallowed via robots.txt, using link data alone, although most modern SEs do not do this.
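
	To make the distinction concrete, here is a rough sketch in Python of the two separate checks. It is only an illustration, not how any particular search engine works: the helper names may_crawl and may_index, the "ExampleBot" user agent, and the example.com URLs are all made up for this example. The robots.txt check uses the standard library's urllib.robotparser; the meta tag check is a simplified parse of the tag's content attribute.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """robots.txt answers one question only: may this agent DOWNLOAD the URL?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def may_index(meta_robots_content: Optional[str]) -> bool:
    """The robots meta tag answers a different question: may the page be INDEXED?

    meta_robots_content is the content attribute of <meta name="robots">,
    or None when no tag was seen (for example because the page was never
    downloaded in the first place).
    """
    if meta_robots_content is None:
        return True
    directives = {d.strip().lower() for d in meta_robots_content.split(",")}
    return "noindex" not in directives and "none" not in directives


blocked_url = "http://example.com/private/page.html"

# The crawler may not download the page...
print(may_crawl(ROBOTS_TXT, "ExampleBot", blocked_url))

# ...so it never sees any meta tag on it, and nothing forbids listing the
# URL in an index built from links found on other sites.
print(may_index(None))
```

	Note the consequence: a noindex meta tag on a page that is also blocked in robots.txt can never be read by a crawler that obeys robots.txt, so the two mechanisms do not combine the way people often expect.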