For me, The Silver Searcher (ag) has replaced trying to memorise various grep flags. Its defaults produce largely what I'm looking for without needing any flags.
On occasions where I do need more advanced features such as excluding file patterns or directories, I find the man page to be pretty easy to understand.
Wow this is amazing. It's super fast and has a very easy to follow help page, plus the formatting is pretty awesome. I just did a quick benchmark searching for `kthreadd` using grep and ag and it's magnitudes faster than grep. Thank you for sharing this. I hope I can convince the network admins at my workplace to install this tool.
In addition, A+ for ag's design of `mmap()`ing files instead of reading them into a buffer.
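For anyone unfamiliar with the difference, here is a minimal Python sketch of the two approaches. It only illustrates the idea and is not how ag itself is implemented; the function names `count_with_read` and `count_with_mmap` are made up for this example.

```python
import mmap
import os
import tempfile


def count_with_read(path: str, needle: bytes) -> int:
    """Copy the whole file into a bytes buffer, then search the copy."""
    with open(path, "rb") as f:
        data = f.read()
    return data.count(needle)


def count_with_mmap(path: str, needle: bytes) -> int:
    """Map the file into memory and search the mapping directly."""
    if not needle:
        raise ValueError("needle must not be empty")
    if os.path.getsize(path) == 0:
        return 0  # an empty file cannot be mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                # Continue after this hit (non-overlapping, like bytes.count).
                pos = mm.find(needle, pos + len(needle))
            return count


if __name__ == "__main__":
    # Hypothetical demo file; any path on disk would do.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"kthreadd\nsomething else\nkthreadd again\n")
    print(count_with_read(tmp.name, b"kthreadd"))
    print(count_with_mmap(tmp.name, b"kthreadd"))
    os.unlink(tmp.name)
```

With the mapping, the kernel pages the file in on demand and the search runs over the page cache directly, rather than over a second copy in the process.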
Benchmarks:
Using Grep
```bash
$ time grep -Rsn "kthreadd"
...
real 0m25.341s
user 0m5.738s
sys 0m4.074s
```
> I just did a quick benchmark searching for `kthreadd` using grep and ag and it's magnitudes faster than grep.
I'd be careful with that. You may have just tested the awesomeness of Linux caches :-)
Yes, it "looks" magnitudes faster:
```bash
$ time grep -Rsn "kthreadd"
...
real 0m26.895s
user 0m3.131s
sys 0m3.151s

$ time ag "kthreadd"
...
real 0m0.906s
user 0m1.930s
sys 0m1.186s

$ time rg "kthreadd"
...
real 0m0.860s
user 0m1.756s
sys 0m1.123s
```
But wait, somehow grep got magnitudes faster too:
```bash
$ time grep -Rsn "kthreadd"
...
real 0m2.961s
user 0m1.979s
sys 0m0.959s
```
And we can make both ag and rg slower:
```bash
# echo 1 > /proc/sys/vm/drop_caches
$ time ag "kthreadd"
...
real 0m17.761s
user 0m2.371s
sys 0m2.728s

# echo 1 > /proc/sys/vm/drop_caches
$ time rg "kthreadd"
...
real 0m19.276s
user 0m1.859s
sys 0m3.063s
```
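If you want to repeat this kind of comparison without being fooled by the page cache, one rough approach is to run each command several times and look at the repeat runs separately from the first. Below is a minimal Python sketch of that; `time_runs` is a made-up helper, and it does not drop caches itself (that needs root), so the first run's cache state is simply unknown.

```python
import subprocess
import sys
import time


def time_runs(cmd: list[str], runs: int = 3) -> list[float]:
    """Run `cmd` `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal rendering does not dominate the timing.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in command; replace with e.g. ["grep", "-Rsn", "kthreadd", "."].
    cmd = [sys.executable, "-c", "pass"]
    for i, secs in enumerate(time_runs(cmd), start=1):
        note = "cache state unknown" if i == 1 else "probably warm cache"
        print(f"run {i}: {secs:.3f}s ({note})")
```

Comparing tools on their repeat runs, or on runs that each follow a cache drop, keeps the comparison like for like.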
I also default to ag now, and it is good at picking up things like .gitignore files to exclude files. It works very well; I just wish it had a flag to limit the size of files searched. If a large file is matched, it can make the results harder to parse visually, and most of the time when programming I only want to scan files that are under 100k (at most ... probably 10k is fine for a first pass).
This is really bad if the large file is some concatenated, minified JavaScript thing that is one massive line; that really makes the results harder to explore (I tend to `ag foosearchterm | less -R` at that point). Perhaps capping the line length in a result is another solution: if a line is over a couple hundred chars, only show the relevant portion.
Thank you! I just tried it and `--max-filesize` worked great ... `-o` also worked; however, it would be nice if `-o` showed a bit of context around the match (maybe 50 chars or so before and after). Still useful all the same. Thanks again.
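The "only show the relevant portion" idea is simple enough to sketch as a post-processing step. This is a minimal Python illustration, not a feature of ag or ripgrep; `match_snippets` and its `context` parameter are hypothetical names.

```python
import re


def match_snippets(line: str, pattern: str, context: int = 50) -> list[str]:
    """Return each regex match in `line` with up to `context` chars either side."""
    snippets = []
    for m in re.finditer(pattern, line):
        start = max(m.start() - context, 0)
        end = min(m.end() + context, len(line))
        # Mark the sides where the line was actually cut.
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(line) else ""
        snippets.append(f"{prefix}{line[start:end]}{suffix}")
    return snippets


if __name__ == "__main__":
    # Simulated one-line minified file with the search term buried in it.
    minified = "a" * 300 + "foosearchterm" + "b" * 300
    for snippet in match_snippets(minified, "foosearchterm", context=10):
        print(snippet)
```

Short lines come back whole, so normal source files would look the same as before.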
Give ripgrep a try. It is probably better than anything else that you might have tried.
If you are using Emacs, then rg (an Emacs interface to ripgrep) is just excellent. Combined with embark and wgrep, it is almost magical in what it can do.
Doesn’t do me much good, though. I tend to use a GUI. I will, on occasion, use RegEx to look for stuff, but I generally just stare at code. I use Xcode, and Xcode has a lot of tools for doing things like going to where a method or property is defined (I use that, all the time). It also allows you to quickly see documentation on an entity. You can write code docs in a manner that generates this “quick look” documentation[0].
When I write code, I do so in a way that affords this kind of review (because I’m usually the poor schlub that has to look at my code).
Also, for me, there’s really no substitute for runtime analysis, like stepping through running code in a GUI debugger. If you want to trace an execution thread, following it in real time can't be beat.
I’m grateful to have that tool. I realize that it is not available to all, especially when working with a server-based or embedded codebase.
I usually write code for Apple systems, which means I have a great deal of control over the runtime environment, and lots of analysis tools. When I work on my server code, I stare at it a lot, and use tools like Charles Proxy and Postman to figure out what’s going on.
Would you know a GUI tool that would use the social network of the functions (who's calling who) to build a visual representation based on metrics of your choice? (ex: eigencentrality)
Ideally it would integrate with an editor to open it on the function you click, and allow other things that make sense to explore a codebase (ex: filter by filename to exclude some files and all the functions they contain)
I've been asked to look at some Java code (not my favorite thing). There are about 8000 files, 150 of which seem more important. Having a premade graph to model their relations would let me read the 150 files in order of importance, starting with the most connected one, and maybe even stopping early once I've understood enough of the codebase.
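To make the ranking idea concrete, here is a small Python sketch of eigenvector centrality computed by power iteration over a call graph. It assumes the call graph has already been extracted somehow (that is the hard part and is not shown); the `calls` mapping and the function names in it are invented for the example, and edges are treated as undirected so callers and callees both count.

```python
from collections import defaultdict


def eigenvector_centrality(calls: dict[str, list[str]],
                           iterations: int = 100) -> dict[str, float]:
    """Approximate eigenvector centrality of each function by power iteration."""
    neighbours: dict[str, set[str]] = defaultdict(set)
    for caller, callees in calls.items():
        neighbours[caller]  # make sure callers with no callees still appear
        for callee in callees:
            if callee == caller:
                continue  # ignore direct recursion
            neighbours[caller].add(callee)
            neighbours[callee].add(caller)

    score = {name: 1.0 for name in neighbours}
    for _ in range(iterations):
        # Iterate with (A + I) rather than A so bipartite graphs do not oscillate.
        updated = {n: score[n] + sum(score[m] for m in neighbours[n])
                   for n in neighbours}
        norm = sum(v * v for v in updated.values()) ** 0.5
        score = {n: v / norm for n, v in updated.items()}
    return score


if __name__ == "__main__":
    # Hypothetical call graph: function -> functions it calls.
    calls = {
        "main": ["parse", "run"],
        "run": ["parse", "log"],
        "parse": ["log"],
        "helper": ["log"],
    }
    scores = eigenvector_centrality(calls)
    for name in sorted(scores, key=scores.get, reverse=True):
        print(f"{name}: {scores[name]:.3f}")
```

Sorting files by the best score of the functions they contain would give a reading order of the kind described, though whether it matches "importance" in a real codebase is an open question.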
Many years ago, when working at a company with a huge legacy C/C++ codebase, one of the coworkers recommended I try out Source Insight[0]. After tweaking a few of the settings, it blew any other IDE out of the water. The ability to just click on a symbol and immediately see all the xrefs at the bottom was super helpful. IIRC it's also commonly used by security researchers to find their way through a new project.
I haven't used it, but I have experience with IntelliJ IDEA and GoLand, and they are slower and more cumbersome. In Source Insight you click on any symbol and immediately get xrefs, the function's source code (if it's a function), etc...
When I wrote C++ code, hands down Visual Studio + Visual Assist any day of the week. It's not that it had any unique features that are impossible to get anywhere else; it's just that the ease and simplicity of using a proper cohesive GUI toolset gave my productivity a big boost. It also integrated well with API documentation, etc. This was well before C++11 and all that jazz, so does anyone have any suggestions for modern C++?
I have used Visual Studio before and it's excellent for C++ projects, especially when combined with Vim bindings and autocomplete / auto-suggestion features. Plus, the Microsoft C++ compiler (MSVC) is very well written and maintained, and I often find its compiler warnings very helpful. I wish I had the luxury of using Visual Studio at work.
It is, at least until the codebase grows so much that the IDE can no longer keep up with it.
(Or maybe it's just a problem specific to C++ projects and IDEs. The largest codebases I worked with, the ones where I hit this problem, were C++ ones.)
Yeah that can happen, though in my experience it's far more common that IDEs are broken by some janky custom build system set up by people that use Vim or Notepad++.
I think you're also right that some languages make this harder for IDEs than others. C++ is probably a worst case. Something like Java is trivial because the file structure matches the namespaces. It's probably not a coincidence that every Java IDE I've used has absolutely top class code intelligence tools.
Yeah I went from Java to JS and was surprised how much worse the code browsing experience is.
C is probably not as good as Java but I remember using Eclipse the first time with C for operating systems class, and being amazed with what a large codebase looks like and how easy it was to maneuver.
C++ contributes a lot to this problem. IIRC you can't even reason about C++ code until you've parsed all of it. Undoubtedly, IDEs employ clever tricks and shortcuts, but even they can do only so much.
That's a mighty big assumption that someone is going to go to the trouble of setting up an IDE for a project they want to read the code of. I pull up code all the time that I have no intention of modifying just to understand how it works a bit better.
This was my thought too: who cares what lines match where when an IDE can take me to the line and, more importantly, show me the context for that line?
I've been tinkering with an AST-based grep tool (it only works for JS/TS right now). It allows you to search for code in a more free-form way, and I hope better code-searching tools like this can come out, because searching by string is really primitive (pun intended).
Have been working on something similar, although my use case is more about learning how code has changed across git commits: https://github.com/bugout-dev/locust
For Javascript/Typescript/React support, like you, I hooked into the Babel toolchain. Can't recommend it highly enough.
There's also a newish project called quick-lint-js which seems to have written their own from-scratch AST parser for JS, but I haven't tried it yet: https://github.com/quick-lint/quick-lint-js
Finally, another project that I know in this space is comby (I believe it is owned/maintained by the folks at Sourcegraph): https://comby.dev/
Don't know why I dumped all those links there. Just figured there may be something useful in them for you. Am also just super passionate about building knowledge about code bases by analyzing their ASTs. Nice to meet a fellow enthusiast. :)
Thank you! It's amazing to see so much work going into tools like this :) I am a huge fan of ASTs and AST-based tools. I think there is so much to explore in this space.
I've been toying with a similar idea, also for JS/TS, but I'm struggling to come up with a search term syntax that'd feel even somewhat intuitive. Maybe "function foo" would match all functions named foo, including those defined with `const foo = () =>`, and "var foo" would just match all values named foo, including functions? I don't know, I feel like there are lots of edge cases that might make this approach fall apart.
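As a rough illustration of what "function foo" could mean at the AST level, here is a sketch that uses Python's own `ast` module as a stand-in for a JS/TS parser. The analogy is loose (`def foo` for `function foo`, `foo = lambda` for `const foo = () =>`), and `find_functions` is a made-up name, not part of any of the tools mentioned in this thread.

```python
import ast


def find_functions(source: str, name: str) -> list[int]:
    """Return the line numbers where a function called `name` is defined.

    Matches both `def name(...)` and `name = lambda ...`.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_def and node.name == name:
            lines.append(node.lineno)
        elif isinstance(node, ast.Assign) and isinstance(node.value, ast.Lambda):
            # Only simple `name = lambda ...` targets are considered here.
            if any(isinstance(t, ast.Name) and t.id == name for t in node.targets):
                lines.append(node.lineno)
    return sorted(lines)


if __name__ == "__main__":
    sample = (
        "def foo():\n"
        "    pass\n"
        "\n"
        "foo = lambda: 1\n"
        "foo_count = 3  # def foo in a comment is not a match\n"
    )
    print(find_functions(sample, "foo"))
```

Even this tiny version shows the edge-case problem: destructuring, class methods, and functions passed as arguments would each need their own rule.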
I have a working prototype for a similar thing for PHP. But I abandoned it for now, as I have no time to improve on it and the query syntax is far from perfect.
Not quite - Using grep, I can go into a deep sub directory and search within just those files, which I don't think you can do from the find-in-files feature in VSCode.
There are lots of ripgrep recommendations in the comments, but an interesting alternative to grep specifically for code browsing might be weggli by my teammate Felix!