by simonw
1234 days ago
Yeah, the GitHub robots.txt is surprisingly restrictive: https://github.com/robots.txt

  User-agent: *
  Disallow: /*/pulse
  Disallow: /*/tree/

That "/*/tree" rule means that search engine crawlers are allowed to hit the README file of a repo but effectively NONE of the other files in it.

Which means that if you keep your project documentation on GitHub in a docs/ folder it won't be indexed! You need to publish it to a separate site via GitHub Pages, or use https://readthedocs.org/

(Side note: I just noticed https://github.com/ekansa/Open-Context-Data is explicitly listed in the robots.txt for GitHub - the only repo that gets a mention like that. I'd love to know the story behind that!)
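If you want to check which paths those two quoted Disallow rules would catch, here is a rough sketch. It assumes the common wildcard semantics ("*" matches any run of characters, a trailing "$" anchors the end) and only models the two rules quoted above, not the full file; it also ignores Allow rules and longest-match precedence. The helper names and the example repo paths are made up for illustration. As far as I know, Python's built-in urllib.robotparser does plain prefix matching and does not expand "*", which is why this rolls its own matcher.

```python
import re

# Only the two rules quoted in the comment; the real file has more.
DISALLOW = ["/*/pulse", "/*/tree/"]


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule into a regex (hypothetical helper)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """True if any quoted Disallow rule matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)


if __name__ == "__main__":
    for path in [
        "/someuser/somerepo",                  # repo root, README is rendered here
        "/someuser/somerepo/tree/main/docs",   # directory listing
        "/someuser/somerepo/pulse",            # activity page
    ]:
        print(path, "->", "blocked" if is_disallowed(path) else "allowed")
```

The repo root comes out as allowed while the /tree/ listing and /pulse come out as blocked, which matches the point about a docs/ folder not being reachable by crawlers through the directory listing.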
Also, very relatable to see a decade-old "I'll update this shortly" comment that was never updated. We all have a few of those.