Hacker News
by
staplung
1232 days ago
If they can't hit `/*/tree`, is there a way to know the URLs of the files?
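To make the question concrete, here is a rough sketch of how a wildcard rule like `/*/tree` could be evaluated against URL paths. It assumes the common robots.txt convention where `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names and the example paths are made up for illustration, and the only rule taken from this thread is `/*/tree`.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed convention: `*` matches any run of characters, a trailing `$`
    anchors the end of the path, and everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" where a "*" stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_pattern: str) -> bool:
    """Return True if `path` matches the Disallow pattern from its start."""
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None


# Hypothetical paths, not taken from any real repository.
print(is_blocked("/someuser/somerepo/tree/main", "/*/tree"))
print(is_blocked("/someuser/somerepo/blob/main/README.md", "/*/tree"))
```

Under these assumptions, directory listings under `/tree` are off limits, while a direct link to a file page on some other path is not covered by this particular rule.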
2 comments
pancrufty
1232 days ago
Direct links from crawlable pages
kadoban
1232 days ago
Sure, clone the git repo.
utopcell
1232 days ago
GitHub would not be happy with Google cloning all repos, and many of them at a high frequency, in order to circumvent a robots.txt restriction.
kadoban
1232 days ago
They're clever people, they could just do partial updates (pull instead of clone). I doubt it'd be that much of a strain.
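A minimal sketch of the "pull instead of clone" idea, assuming a local mirror directory per repository: clone once, then only pull new commits on later visits. The function names and the `--ff-only` choice are illustrative assumptions, not a description of how any real crawler works.

```python
import os
import subprocess
from typing import List


def sync_command(repo_url: str, dest: str) -> List[str]:
    """Pick the git command for bringing a local mirror up to date.

    If `dest` already holds a checkout, an incremental pull is enough;
    otherwise a full clone is needed once.
    """
    if os.path.isdir(os.path.join(dest, ".git")):
        # Incremental update: only objects missing locally are transferred.
        return ["git", "-C", dest, "pull", "--ff-only"]
    # First visit: download the whole repository.
    return ["git", "clone", repo_url, dest]


def sync(repo_url: str, dest: str) -> None:
    """Run the chosen command, raising CalledProcessError if git fails."""
    subprocess.run(sync_command(repo_url, dest), check=True)
```

Only the first `sync` call for a repository pays the full clone cost; later calls transfer just the objects added since the previous visit.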