I use Mozilla's DXR (or its searchfox.org fork) every day, and it is pretty great for searching code.
Not only can it quickly search across large codebases, it parses JS/Python and the output of clang (for C/C++) to allow quickly finding the definitions of functions, declarations of variables, and so on (try hovering over a variable for instance):
Nothing that many mainstream IDEs can't do, but having it on the web and being able to quickly link people and not requiring local setup helps tremendously to get people up to speed quickly.
I might just be searching for the wrong terms, but am I right that this is specific to searching the Mozilla codebase? Is there a version that works on arbitrary codebases?
I would love to be able to search all code for a string and then either (1) sort the resulting repositories by stars/forks; or (2) limit the results to repositories with >X stars/forks. When learning a new framework or library I like to find popular projects that use it and read the code to get a sense of conventions, architecture, etc. For instance, it'd be fantastic to find all repositories with over 20 stars containing a *.py file with "import flask" or "from flask" in them.
Unfortunately, you can search code by file extension and phrase, and you can use advanced search to search repository descriptions filtered by stars, but I don't believe you can do both at once.
For instance, searching for "flask" and limiting the results to >1000 stars returns only the 27 repositories with a matching description[0], but the code search returns over 4 million results, ignoring the stars parameter[1].
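To make the wished-for query concrete, here is a rough sketch of the filter applied to data you already have locally. Everything here is an assumption for illustration: `CodeHit`, `popular_flask_repos`, and the `stars` mapping are hypothetical names, not part of any real search API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeHit:
    """One matching line from a code search (hypothetical record shape)."""
    repo: str
    path: str
    line: str


# Matches "import flask" or "from flask ..." but not "import flask_sqlalchemy".
FLASK_IMPORT = re.compile(r"^\s*(?:import|from)\s+flask\b")


def popular_flask_repos(hits, stars, min_stars=20):
    """Return repos with more than min_stars stars that import flask in a
    *.py file, most-starred first. `stars` maps repo name to star count."""
    matching = {
        hit.repo
        for hit in hits
        if hit.path.endswith(".py")
        and FLASK_IMPORT.match(hit.line)
        and stars.get(hit.repo, 0) > min_stars
    }
    return sorted(matching, key=lambda repo: stars[repo], reverse=True)
```

The point is only that the code filter and the star filter have to be applied to the same result set, which is the part that seems to be missing today.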
How would you build the search results UI for a grouped query like this? If one repository has 10k stars and has 1000 files with matching strings, should the first 1000 results be from the same repository?
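One possible answer, sketched under my own assumptions: group hits by repository, rank the groups by stars, and show only a few files per repository along with the total count. `group_results` and its parameters are hypothetical names, and the cap of three files is arbitrary.

```python
from collections import defaultdict


def group_results(hits, stars, per_repo=3):
    """Group (repo, path) hits by repo and rank the groups by star count.

    Returns a list of (repo, shown_paths, total_matches) tuples, where
    shown_paths holds at most per_repo paths so that one huge repository
    cannot fill the whole first page.
    """
    by_repo = defaultdict(list)
    for repo, path in hits:
        by_repo[repo].append(path)

    ranked = sorted(by_repo, key=lambda repo: stars.get(repo, 0), reverse=True)
    return [(repo, by_repo[repo][:per_repo], len(by_repo[repo])) for repo in ranked]
```

With this shape, a 10k-star repository with 1000 matching files takes up one entry showing a handful of files and a "1000 matches" count, and the next repository appears right below it.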
GitHub should try to compile the code it hosts. Where that succeeds, they get full type information for every variable and information on every function call, just like a good IDE has when doing code completion.
With that info, they would be able to build an awesome search system.
GitLab manages to do CI (partnering with DigitalOcean), which got me to switch at least partially, and it has further potential for upsell (have more CI servers! with exotic configurations!)
Not all languages that could significantly benefit from pulling out type info even compile. Something significantly smaller in scope would be to have GitHub's search be aware of and consume some kind of intellisense-esque database or structured documentation format that any CI process could output. (Of course, someone needs to write the tools to generate said output in the first place...)
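As a toy illustration of what such a generator could look like for a language that does not compile, here is a minimal sketch using Python's standard `ast` module. The output format (a list of name/kind/file/line records) is purely my assumption, not an existing standard, and `index_definitions` is a hypothetical name.

```python
import ast
import json


def index_definitions(source, filename="<memory>"):
    """Return a list of dicts describing every function and class defined
    in the given Python source, ordered by line number."""
    tree = ast.parse(source, filename=filename)
    entries = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        entries.append(
            {"name": node.name, "kind": kind, "file": filename, "line": node.lineno}
        )
    return sorted(entries, key=lambda entry: entry["line"])


if __name__ == "__main__":
    sample = "class A:\n    def f(self):\n        pass\n\ndef g():\n    pass\n"
    # A CI job could write this JSON next to the build artifacts.
    print(json.dumps(index_definitions(sample, "sample.py"), indent=2))
```

This only records where things are defined. Type information and call sites would need a real analyzer, which is the hard part the comment above alludes to.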
https://dxr.mozilla.org/mozilla-central/source/browser/compo... https://dxr.mozilla.org/mozilla-central/source/toolkit/mozap...