by foobar__ 3580 days ago
How is it better?

From their page:

It searches code about 3–5× faster than ack.

It searches code as fast as the_silver_searcher(ag).

It ignores file patterns from your .gitignore.

It searches UTF-8, EUC-JP and Shift_JIS files.

It provides binaries for multi platform (Mac OS X, Windows, Linux).

But even more so, it has sensible defaults that work nicely for code search use cases. And it's written in Go, without the accumulated cruft for 200 platforms and POSIX subtleties that grep carries, so even someone like me can hack it to do something special if I want to.

Ag handles .gitignore as well. And has defaults appropriate for code searching. So the difference is official Windows support?

It's also faster (30% to 50% for my codebase), written in Go, and easier to extend.

  $ time ag zmq | wc -l
      63

  real   	0m0.017s
  user   	0m0.022s
  sys    	0m0.016s

  $ time pt zmq | wc -l
      296

  real   	0m0.013s
  user   	0m0.014s
  sys    	0m0.017s

(The timing differences were consistent over multiple runs of both tools on the same codebase, so this is not a disk cache effect. The wc difference comes from different "surrounding context" settings.)
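
For anyone who wants to check a percentage against the wall-clock numbers above, here is a quick sketch of the arithmetic in Python. The variable names are mine, and a single sample in the 13 to 17 ms range is only illustrative.

```python
# Wall-clock ("real") times from the two runs above, in seconds.
ag_real = 0.017
pt_real = 0.013

# How much longer ag took, relative to pt's time.
ag_slower_pct = (ag_real / pt_real - 1) * 100

# How much less time pt took, relative to ag's time.
pt_saving_pct = (1 - pt_real / ag_real) * 100

print(f"ag took {ag_slower_pct:.1f}% longer than pt")
print(f"pt took {pt_saving_pct:.1f}% less time than ag")
```

Which of the two figures you quote depends on which tool you treat as the baseline.
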
I'm the author of ag. Pt has drawbacks that you may not be aware of. For example: pt is case-sensitive by default. (Ag defaults to smart-casing. If your query is all lowercase, it's case-insensitive.) If you tell pt to do a case-insensitive search, it kills performance. Here's an example:

    ggreer@lithium:~/code% time ag instantiationname
    ...
    ag instantiationname  3.82s user 2.16s system 157% cpu 3.797 total

    ggreer@lithium:~/code% time pt -i instantiationname
    ...
    pt -i instantiationname  136.65s user 0.77s system 773% cpu 17.761 total

That's searching a 20GB code directory. Both find the same matches (instances of "InstantiationName" in node/deps/gtest/include/gtest/gtest-param-test.h).
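
The smart-casing rule mentioned above (an all-lowercase query is matched case-insensitively) can be sketched in a few lines of Python. This is only an illustration of the rule, not ag's actual implementation; the function name `smart_case_compile` is made up, and it treats the query as a literal string rather than a regex.

```python
import re

def smart_case_compile(query):
    """Compile `query` as a literal pattern using the smart-case rule.

    Any uppercase letter in the query makes the search case-sensitive;
    otherwise the search ignores case.
    """
    has_upper = any(ch.isupper() for ch in query)
    flags = 0 if has_upper else re.IGNORECASE
    return re.compile(re.escape(query), flags)

# An all-lowercase query matches text in any case.
lower = smart_case_compile("instantiationname")
print(bool(lower.search("class InstantiationName {")))

# A query containing uppercase letters only matches exactly.
mixed = smart_case_compile("InstantiationName")
print(bool(mixed.search("class instantiationname {")))
```
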

Another issue is that pt tends to bail on errors. If you tell it to follow symlinks (-f) and it encounters a single broken symlink, it will exit without finishing the search. Every other search tool I know of (grep, ag, ack) will keep on truckin'.
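
To make the "keep going" behaviour concrete, here is a minimal Python sketch of a search loop that follows symlinks and skips anything it cannot open instead of exiting. It is not how grep, ag, or ack are implemented; `search_tree` is a hypothetical helper and it does plain substring matching only.

```python
import os

def search_tree(root, needle):
    """Return the sorted paths of files under `root` that contain `needle`.

    Directory symlinks are followed. Files that cannot be opened
    (for example a dangling symlink) are skipped rather than treated
    as fatal.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as fh:
                    if any(needle in line for line in fh):
                        hits.append(path)
            except OSError:
                # Broken symlink, permission problem, etc.: move on.
                continue
    return sorted(hits)
```

The only design choice that matters here is that the `try`/`except` sits inside the per-file loop, so one bad entry costs one file and not the whole search.
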

That said, pt is faster than ag for case-sensitive matches. It looks like much of that comes from a parallelized directory traversal. The architecture of ag is different. It has many worker threads, but only one thread going through dirs. Looks like I'll have to step up my game. :)
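
As a rough illustration of what a parallelized directory traversal means, here is a Python sketch where every directory listing is its own task in a thread pool, so several directories can be read at once instead of one thread walking the whole tree. This is not pt's code (pt is written in Go) and not ag's either; `list_dir`, `parallel_walk`, and the default of 8 workers are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def list_dir(path):
    """Return (subdirectories, files) directly under `path`."""
    subdirs, files = [], []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    files.append(entry.path)
    except OSError:
        # Unreadable directory: report nothing for it and carry on.
        pass
    return subdirs, files

def parallel_walk(root, max_workers=8):
    """Collect every regular file under `root`, listing directories concurrently."""
    found = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {pool.submit(list_dir, root)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                subdirs, files = future.result()
                found.extend(files)
                # Each newly discovered directory becomes its own task.
                pending |= {pool.submit(list_dir, d) for d in subdirs}
    return sorted(found)
```

In a real search tool the discovered files would be handed to matcher workers as they are found, rather than collected into a list first.
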

I found another pt performance issue. Searching an 8GB file:

    % time pt -i hello4294967296 big_file.txt
    big_file.txt
    134217728:hello4294967296
    
    pt -i hello4294967296 big_file.txt  535.86s user 1.35s system 100% cpu 8:56.97 total

That's right: pt takes 9 minutes to do a case-insensitive match on an 8GB file. On the same machine doing the same search, ag takes 7 seconds and grep takes 4 seconds.

Also, it looks like pt was beating ag because it defaults to case-sensitive search. If I make ag do the same case-sensitive search, it's slightly faster in my benchmarks (0.63s to search my ~/code directory vs pt's 0.65s).

Thanks, original parent. It never occurred to me to do case-insensitive searches, but I can see them being useful in some cases.

My main beef with ag was probably due to some older distribution or worse packager (tried it a few years ago), because I tried it again now from "brew" and it's pretty much just as convenient as pt -- and with some more features.