Hacker News
Effective Code Browsing (aknooh.github.io)
101 points by anooh 1718 days ago
12 comments

For me The Silver Searcher (Ag) has replaced trying to memorise various grep flags. Its defaults produce largely what I'm looking for without needing any flags.

On occasions where I do need more advanced features such as excluding file patterns or directories, I find the man page to be pretty easy to understand.

Wow this is amazing. It's super fast and has a very easy to follow help page, plus the formatting is pretty awesome. I just did a quick benchmark searching for `kthreadd` using grep and ag and it's magnitudes faster than grep. Thank you for sharing this. I hope I can convince the network admins at my workplace to install this tool.

In addition, A+ for ag's design of `mmap()`ing files instead of reading them into a buffer.

Benchmarks:

Using grep:

```bash
$ time grep -Rsn "kthreadd"
...
real 0m25.341s
user 0m5.738s
sys 0m4.074s
```

Using ag:

```bash
$ time ag "kthreadd"
...
real 0m2.182s
user 0m1.306s
sys 0m1.773s
```

> I just did a quick benchmark searching for `kthreadd` using grep and ag and it's magnitudes faster than grep.

I'd be careful with that. You may have just tested the awesomeness of linux caches :-)

Yes, it "looks" magnitudes faster:

  ]$ time grep -Rsn "kthreadd"
  ...
  real    0m26.895s
  user    0m3.131s
  sys     0m3.151s

  ]$ time ag "kthreadd"
  ...
  real    0m0.906s
  user    0m1.930s
  sys     0m1.186s

  ]$ time rg "kthreadd"
  ...
  real    0m0.860s
  user    0m1.756s
  sys     0m1.123s
But wait, somehow grep got magnitudes faster too:

  ]$ time grep -Rsn "kthreadd"
  ...
  real    0m2.961s
  user    0m1.979s
  sys     0m0.959s
And we can make both ag and rg slower:

  ]# echo 1 > /proc/sys/vm/drop_caches
  ]$ time ag "kthreadd"
  ...
  real    0m17.761s
  user    0m2.371s
  sys     0m2.728s

  ]# echo 1 > /proc/sys/vm/drop_caches
  ]$ time rg "kthreadd"
  ...
  real    0m19.276s
  user    0m1.859s
  sys     0m3.063s
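
One way to keep the page cache from skewing a comparison like the one above is to do an untimed warm-up run first and then time several runs of each tool. The following is a minimal sketch, not something from the thread: `bench` is a hypothetical helper name, and it assumes GNU `date` (for `%N` nanoseconds), so it is Linux-oriented.

```bash
# Hypothetical helper: time a command on a warm cache.
# Assumption: GNU date, whose %N format prints nanoseconds.
bench() {
  local runs=$1
  shift
  local i=1 start end
  # Untimed warm-up run so file contents are already in the page cache.
  "$@" >/dev/null 2>&1 || true
  while [ "$i" -le "$runs" ]; do
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s%N)
    echo "run $i: $(( (end - start) / 1000000 )) ms"
    i=$((i + 1))
  done
}

# Example: bench 3 grep -Rsn "kthreadd" .
```

Running it once per tool (grep, ag, rg) over the same tree should give numbers that reflect the tools rather than whichever one happened to run first.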
I'd suggest ripgrep [0] for even faster results.

[0] https://github.com/BurntSushi/ripgrep

ripgrep doesn't seem to just be fast, it has a good set of features compared to the alternatives too! https://beyondgrep.com/feature-comparison/
I always wondered what the differences between all those mysteriously named search tools are, great resource, thanks for sharing!
Might also want to use FZF as selection interface for the ripgrep results, if applicable.
It's Rust-based, though, and is thus difficult to get installed on older systems.
Which systems? I have a ten year old laptop it works fine on.
I also default to using ag now, and it is good at picking up things like .gitignore files to exclude files. It works very well; I just wish it had a flag to limit the size of files searched, because if a large file is found it can make the results harder to parse visually, and most of the time when programming I only want to scan files that are under 100k (at most ... probably 10k is fine for a first pass).

This is really bad if the large file is some concatenated, minified JavaScript thing that is one massive line; that really makes the results harder to explore (I tend to `ag foosearchterm | less -R` at that point). Perhaps capping the line length in a result is another solution: if a line is over a couple hundred chars, only show the relevant portion.

If you use ripgrep, -o will only show the matching part of the line. It also has a --max-filesize flag.
Thank you! I just tried it and --max-filesize worked great, ... -o also worked however it would be nice if -o showed a bit of context around the match (maybe 50 chars or so before and after), still useful all the same. Thanks again.
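
A possible workaround for the "context around the match" wish is to put the context into the regex itself, so that `-o` prints the match plus up to N characters on either side. This is only a sketch: `ctx` is a made-up helper name, it uses `grep -oE` so it runs without extra tools, and it assumes the pattern is a valid extended regex.

```bash
# Hypothetical helper: print each match with up to WIDTH characters
# of context before and after it, instead of the whole (huge) line.
ctx() {
  local pattern=$1 width=$2
  shift 2
  grep -oE ".{0,${width}}${pattern}.{0,${width}}" "$@"
}

# Example: ctx foosearchterm 50 bundle.min.js
# The same regex should also work with ripgrep:
#   rg -o '.{0,50}foosearchterm.{0,50}'
```

Matches that sit closer together than the chosen width will be merged into one output line, which is usually fine for skimming.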
give ripgrep a try. it is probably better than anything else that you might have tried.

if you are using emacs, then rg (emacs interface to ripgrep) is just excellent. that when combined with embark and wgrep is almost magical in what it can do.

Agree, I love Ag. My only grief is that it isn't standard and I still need to be able to use grep.
That’s great stuff.

Doesn’t do me much good, though. I tend to use a GUI. I will, on occasion, use RegEx to look for stuff, but I generally just stare at code. I use Xcode, and Xcode has a lot of tools for doing things like going to where a method or property is defined (I use that, all the time). It also allows you to quickly see documentation on an entity. You can write code docs in a manner that generates this “quick look” documentation[0].

When I write code, I do so, in a manner that affords review in this manner (because I’m usually the poor schlub that has to look at my code).

Also, for me, there’s really no substitute for runtime analysis, like stepping through running code in a GUI debugger. If you want to trace an execution thread, following it in realtime can't be beat.

I’m grateful to have that tool. I realize that it is not available to all; especially when working with a server-based, or embedded, codebase.

I usually write code for Apple systems, which means I have a great deal of control over the runtime environment, and lots of analysis tools. When I work on my server code, I stare at it a lot, and use tools like Charles Proxy and Postman to figure out what’s going on.

[0] https://littlegreenviper.com/miscellany/leaving-a-legacy/

Would you know a GUI tool that would use the social network of the functions (who's calling who) to build a visual representation based on metrics of your choice? (ex: eigencentrality)

Ideally it would integrate with an editor to open it on the function you click, and allow other things that make sense to explore a codebase (ex: filter by filename to exclude some files and all the functions they contain)

I've been asked to look at some java code (not my favorite thing). There are about 8000 files, 150 of which seem more important. Having a premade graph to model their relations would let me read the 150 files in order of importance, starting with the most connected one, and maybe even stopping early once I've understood enough of the codebase.
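
Short of a real call graph, a very crude stand-in for "most connected" is to count how many other files mention each file's class name. The sketch below is only that: `rank_by_references` is a hypothetical helper, it assumes one top-level class per file named after the file, and it measures textual mentions (a rough in-degree), not eigencentrality. It also runs one recursive grep per file, so it will be slow on thousands of files.

```bash
# Hypothetical helper: rank .java files by how many OTHER .java files
# mention their class name as a whole word. Output: "<count><TAB><path>".
rank_by_references() {
  local root=$1 f name refs
  find "$root" -type f -name '*.java' | while IFS= read -r f; do
    name=$(basename "$f" .java)
    # List files containing the name, then drop the defining file itself.
    refs=$(grep -rlw --include='*.java' -e "$name" "$root" | grep -cvxF -e "$f" || true)
    printf '%s\t%s\n' "$refs" "$f"
  done | sort -rn
}

# Example: rank_by_references src/main/java | head -n 20
```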

You can likely find more through references here: https://www.nist.gov/itl/ssd/software-quality-group/static-a...
Source Insight has call graphs: https://www.sourceinsight.com/#call-graphs
NDepend seems to be rather sophisticated in this regard: https://blog.ndepend.com/visualize-code-with-software-archit...
There's also static analysis suites like Understand: https://www.scitools.com/features
I know those tools exist, but I haven't used them.

IBM's Rational Rose used to do something like that, but it was big buck$, and I don't think it exists, anymore.

Many years ago, when working at a company with a huge legacy C/C++ codebase, one of the coworkers recommended I try out Source Insight[0]. After tweaking a few of the settings, it blew any other IDE out of the water. The ability to just click on a symbol and immediately see all the xrefs at the bottom was super helpful. IIRC it's also commonly used by security researchers to find their way through a new project.

[0] https://www.sourceinsight.com

I really like SourceTrail[0]. Unfortunately they've just announced that they're stopping development. [0] https://www.sourcetrail.com/
Can't you do that with CLion?
I haven't used it, but having experience with IntelliJ IDEA and GoLand, they are slower and more cumbersome. In SourceInsight you click on any symbol and immediately you get xrefs, the function's source code (if it's a function), etc...
When I wrote C++ code, it was hands down Visual Studio + Visual Assist any day of the week. It's not that it had any unique features that are impossible to get anywhere else; it's just that the ease and simplicity of using a proper cohesive GUI toolset gave my productivity a big boost. It also integrated well with API documentation, etc. This was well before C++11 and all that jazz, so does anyone have any suggestions for modern C++?
I have used Visual Studio before and it's excellent for C++ projects, especially when combined with vim bindings and auto-complete / auto-suggestion features. Plus, the Microsoft C++ compiler (MSVC) is very well written and maintained, and I often find the compiler warnings very helpful. I wish I had the luxury of using Visual Studio at work.
Modern Visual Studio should be good enough for most projects. Also I've heard good things about CLion.

I'd try VSCode as well. Not sure if it's good, but it'll get better, that's for sure; it's an extremely hyped project ATM.

Using a statically typed language and an IDE is about a billion times better than grep.
It is, at least until the codebase grows so much that the IDE can no longer keep up with it.

(Or maybe it's just a problem specific to C++ projects and IDEs. The largest codebases I worked with, the ones where I hit this problem, were C++ ones.)

Yeah that can happen, though in my experience it's far more common that IDEs are broken by some janky custom build system set up by people that use Vim or Notepad++.

I think you're also right that some languages make this harder for IDEs than others. C++ is probably a worst case. Something like Java is trivial because the file structure matches the namespaces. It's probably not a coincidence that every Java IDE I've used has absolutely top class code intelligence tools.

Yeah I went from Java to JS and was surprised how much worse the code browsing experience is.

C is probably not as good as Java but I remember using Eclipse the first time with C for operating systems class, and being amazed with what a large codebase looks like and how easy it was to maneuver.

Intellij's webstorm does js pretty well - almost as well as their java offerings.

Having said that, VSCode these days is also pretty damn good.

Yeah Typescript in VSCode is pretty good. I guess because the VSCode developers use it all the time!
C++ contributes a lot to this problem. IIRC you can't even reason about C++ code until you've parsed all of it. Undoubtedly, IDEs employ clever tricks and shortcuts, but even they can do only so much.
That's a mighty big assumption that someone is going to go to the trouble of setting up an IDE for a project they want to read the code of. I pull up code all the time that I have no intention of modifying just to understand how it works a bit better.
I've found out that GitHub provides clickable Java identifiers, like an IDE. Incredibly convenient for your use case, if it works.
This was my thought too: who cares what lines match where, when an IDE can take me to the line and, more importantly, show me the context for that line.
This comes nowhere close to eliminating the utility of grep, especially in large code bases.

Grep helps you search for things in comments and in documentation. It allows you to conveniently ignore files and directories.

Grep and related tooling contain decades worth of insight and wisdom (in a decades-old field) that we should not be so quick to dismiss.
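
As a concrete illustration of searching comments while ignoring files and directories, here is a minimal sketch. `search_src` is a hypothetical wrapper, and the excluded names (`.git`, `build`, `*.min.js`) are assumptions about a typical project layout rather than anything from the thread; the `--exclude-dir` and `--exclude` flags themselves are standard in GNU grep.

```bash
# Hypothetical wrapper: recursive search that skips VCS metadata,
# build output, and minified bundles (all assumed names).
search_src() {
  local pattern=$1 root=${2:-.}
  grep -rn --exclude-dir=.git --exclude-dir=build --exclude='*.min.js' \
    -e "$pattern" "$root"
}

# Example: search_src 'TODO' .
```

Because grep treats everything as text, this finds hits in comments, docs, and config files alike, which an IDE's symbol index typically would not cover.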

Concatenate all the code into a single file and use vim to regexp search and annotate it.

A good way to get up to speed on a complicated pile of C++ filth.
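
The concatenation step could look something like the sketch below. `concat_sources` is a hypothetical helper, the `*.cpp`/`*.h` extensions are an assumption about the codebase, and the banner line before each file is an addition so you can still tell where a chunk came from once everything is in one buffer.

```bash
# Hypothetical helper: concatenate C++ sources into one file, with a
# banner comment naming each original file.
# Write OUT outside ROOT (or with another extension) so it is not
# picked up by the find itself.
concat_sources() {
  local root=$1 out=$2 f
  find "$root" -type f \( -name '*.cpp' -o -name '*.h' \) | sort |
    while IFS= read -r f; do
      printf '// ===== %s =====\n' "$f"
      cat "$f"
    done > "$out"
}

# Example: concat_sources ./src /tmp/all_sources.txt && vim /tmp/all_sources.txt
```

In vim, searching for `^// =====` then jumps from file to file.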

or....use: https://github.com/junegunn/fzf

example:

  fzf --height 60% --layout=reverse --border  --preview 'bat {}'
gives you a nice fuzzy search interface and uses bat for previews

  brew install fzf bat
This just searches filenames though. If you want to search the contents of the files, use something like ripgrep

https://github.com/junegunn/fzf#3-interactive-ripgrep-integr...

I've been tinkering with an AST based grep tool. (It only works for JS/TS right now). It allows you to search for code in a more free-form way and I hope better code-searching tools like this can come out. Because searching by string is really primitive. (pun intended)

Here is an example:

npx @nikhilverma/ast-grep '{ meta: { title: "___" }}' src/

Nice!

Have been working on something similar, although my use case is more about learning how code has changed across git commits: https://github.com/bugout-dev/locust

For Javascript/Typescript/React support, like you, I hooked into the Babel toolchain. Can't recommend it highly enough.

There's also a newish project called quick-lint-js which seems to have written their own from-scratch AST parser for JS, but I haven't tried it yet: https://github.com/quick-lint/quick-lint-js

Finally, another project that I know in this space is comby (I believe it is owned/maintained by the folks at Sourcegraph): https://comby.dev/

Don't know why I dumped all those links there. Just figured there may be something useful in them for you. Am also just super passionate about building knowledge about code bases by analyzing their ASTs. Nice to meet a fellow enthusiast. :)

Thank you! It's amazing to see so much work going into tools like this :) I am a huge fan of AST and AST based tools. I think there is so much to explore in this space.
I've been toying with a similar idea, also for JS/TS, but I'm struggling to come up with a search term syntax that'd feel even somewhat intuitive. Maybe "function foo" would match all functions named foo, including those defined with

  const foo = () =>
and "var foo" would just match all values named foo, including functions? I don't know, I feel like there's lots of edge cases that might make this approach fall apart
I have a working prototype for a similar thing for PHP. But I abandoned it for now, as I have no time to improve on it and the query syntax is far from perfect.
You could check out Semmle. It is trying to solve a similar problem, but with code analysis.
VS Code is more effective. Jump around code just by clicking, get popups for documentation, ctrl-P instant find for files etc.
Not quite - Using grep, I can go into a deep sub directory and search within just those files, which I don't think you can do from the find-in-files feature in VSCode.
You can do this in VSCode with glob patterns to limit the search directory, e.g. "Find 'foobar' within './path/to/deep/directory/*/*.txt'"
There are lots of ripgrep recommendations in the comments, but an interesting alternative to grep specifically for code browsing might be weggli by my teammate Felix!

https://github.com/googleprojectzero/weggli

Title should be:

Quick introduction to grep

Depending on what you're doing, ctags can also be a big help.