by ggreer 3582 days ago
I'm the author of ag. Pt has drawbacks that you may not be aware of. For example: pt is case-sensitive by default. (Ag defaults to smart-casing. If your query is all lowercase, it's case-insensitive.) If you tell pt to do a case-insensitive search, it kills performance. Here's an example:

    ggreer@lithium:~/code% time ag instantiationname
    ...
    ag instantiationname  3.82s user 2.16s system 157% cpu 3.797 total

    ggreer@lithium:~/code% time pt -i instantiationname
    ...
    pt -i instantiationname  136.65s user 0.77s system 773% cpu 17.761 total

That's to search a 20GB code directory. Both find the same matches: instances of "InstantiationName" in node/deps/gtest/include/gtest/gtest-param-test.h.
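
For anyone unfamiliar with smart-casing, here is a minimal Python sketch of the rule described above: the search is case-insensitive unless the query contains an uppercase character. This is only an illustration, not ag's actual code (ag is written in C), and the function names `smart_case_flags` and `smart_search` are made up for the example.

```python
import re


def smart_case_flags(query: str) -> int:
    """Return re.IGNORECASE when the query has no uppercase characters."""
    has_upper = any(ch.isupper() for ch in query)
    return 0 if has_upper else re.IGNORECASE


def smart_search(query: str, text: str) -> bool:
    """Report whether query occurs in text under the smart-case rule."""
    # re.escape treats the query as a literal string, which keeps the
    # illustration simple; a real tool would accept regex patterns.
    pattern = re.compile(re.escape(query), smart_case_flags(query))
    return pattern.search(text) is not None
```

With this rule, an all-lowercase query such as `instantiationname` matches `InstantiationName`, while a query with any capital letter only matches text with exactly that casing.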

Another issue is that pt tends to bail on errors. If you tell it to follow symlinks (-f) and it encounters a single broken symlink, it will exit without finishing the search. Every other search tool I know of (grep, ag, ack) will keep on truckin'.
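
The keep-going behavior is roughly the following, sketched in Python. This is an assumption-laden illustration rather than the code of grep, ag, or ack: the function name `search_skipping_errors` is hypothetical, it reads whole files into memory, and it does not guard against symlink cycles.

```python
import os
import sys


def search_skipping_errors(root: str, needle: bytes) -> list[str]:
    """Return files under root containing needle, skipping unreadable entries."""
    matches = []
    # followlinks=True descends into symlinked directories; a real tool
    # would also need to detect symlink cycles, which this sketch does not.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    if needle in f.read():
                        matches.append(path)
            except OSError as err:
                # Opening a broken symlink raises FileNotFoundError, which is
                # an OSError. Report it and continue with the next file.
                print(f"skipping {path}: {err}", file=sys.stderr)
    return matches
```

The point is only that the error is handled per file, so one bad entry costs a warning instead of the whole search.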

That said, pt is faster than ag for case-sensitive matches. It looks like much of that comes from a parallelized directory traversal. The architecture of ag is different. It has many worker threads, but only one thread going through dirs. Looks like I'll have to step up my game. :)
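
To make the "many workers, one directory walker" shape concrete, here is a rough Python sketch. It is not ag's implementation (which is C) and the name `search_single_walker`, the queue size, and the worker count are all assumptions for illustration. One thread walks the tree and feeds file paths into a queue; the worker threads only search files.

```python
import os
import queue
import threading


def search_single_walker(root: str, needle: bytes, num_workers: int = 4) -> list[str]:
    """Search files under root with one traversal thread and several workers."""
    paths: queue.Queue = queue.Queue(maxsize=1024)
    matches = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            path = paths.get()
            if path is None:  # sentinel: no more work
                return
            try:
                with open(path, "rb") as f:
                    found = needle in f.read()
            except OSError:
                found = False  # unreadable file: skip it
            if found:
                with lock:
                    matches.append(path)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in workers:
        t.start()

    # The calling thread is the only one walking directories, so traversal
    # speed caps how fast the workers can be fed.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.put(os.path.join(dirpath, name))

    for _ in workers:
        paths.put(None)  # one sentinel per worker
    for t in workers:
        t.join()
    return sorted(matches)
```

A parallelized traversal would instead let several threads list directories at once, which helps when walking the tree, not searching file contents, is the bottleneck.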

2 comments

I found another pt performance issue. Searching an 8GB file:

    % time pt -i hello4294967296 big_file.txt
    big_file.txt
    134217728:hello4294967296
    
    pt -i hello4294967296 big_file.txt  535.86s user 1.35s system 100% cpu 8:56.97 total

That's right: pt takes 9 minutes to do a case-insensitive match on an 8GB file. On the same machine doing the same search, ag takes 7 seconds and grep takes 4 seconds.

Also, it looks like pt was beating ag because it defaults to case-sensitive search. If I make ag do the same case-sensitive search, it's slightly faster in my benchmarks (0.63s to search my ~/code directory vs pt's 0.65s).

Thanks, original parent. It never occurred to me to do case-insensitive searches, but I can see them being useful in some cases.

My main beef with ag was probably due to some older distribution or worse packager (tried it a few years ago), because I tried it again now from "brew" and it's pretty much just as convenient as pt -- and with some more features.