| HN Mirror



	by staplung 1232 days ago
	If they can't hit `/*/tree`, is there a way to know the URLs of the files?
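For context, here is a minimal sketch of how a wildcard rule like the `/*/tree` pattern quoted above might be matched against URL paths. It assumes the common crawler convention that `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match. The helper names (`rule_to_regex`, `is_blocked`) and the example paths are hypothetical, made up for illustration, not taken from any real robots.txt.

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt-style path rule into a compiled regex.

    Assumed convention: '*' matches any sequence of characters and a
    trailing '$' anchors the rule to the end of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any disallow rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)

# The rule as quoted in the comment; the paths below are illustrative only.
RULES = ["/*/tree"]

print(is_blocked("/someuser/somerepo/tree/main/src", RULES))
print(is_blocked("/someuser/somerepo/blob/main/README.md", RULES))
```

Under these assumptions the directory listing path is blocked while the direct file path is not, which is why a file URL would have to be discovered some other way.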

2 comments

Direct links from crawlable pages

Sure, clone the git repo.

GitHub would not be happy with Google cloning every repo, many of them at high frequency, just to circumvent a robots.txt restriction.

They're clever people; they could just do incremental updates (pull instead of clone). I doubt it'd be that much of a strain.