Hacker News
by GalacticDomin8r 2635 days ago
It would help if you tested just grep when benchmarking grep. These datapoints tell a much different story.

  # /usr/bin/grep -V
  grep (BSD grep) 2.6.0-FreeBSD
  
  root@m6600:~ # /usr/local/bin/grep -V
  grep (GNU grep) 3.3
  Copyright (C) 2018 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.
  
  Written by Mike Haertel and others; see
  <https://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
  
  root@m6600:~ # /usr/bin/time /usr/bin/grep X-User-Agent packetdump.pcap -c
  60
          0.54 real         0.45 user         0.07 sys
  root@m6600:~ # /usr/bin/time /usr/bin/grep X-User-Agent packetdump.pcap -c
  60
          0.54 real         0.44 user         0.08 sys
  root@m6600:~ # /usr/bin/time /usr/bin/grep X-User-Agent packetdump.pcap -c
  60
          0.54 real         0.41 user         0.11 sys
  root@m6600:~ # /usr/bin/time /usr/local/bin/grep X-User-Agent packetdump.pcap -c
  60
          0.58 real         0.49 user         0.08 sys
  root@m6600:~ # /usr/bin/time /usr/local/bin/grep X-User-Agent packetdump.pcap -c
  60
          0.60 real         0.48 user         0.11 sys
  root@m6600:~ # /usr/bin/time /usr/local/bin/grep X-User-Agent packetdump.pcap -c
  60
          0.59 real         0.50 user         0.08 sys
  root@m6600:~ # du -h -s packetdump.pcap
  225M packetdump.pcap
1 comments

That is a very good point. Taking this better approach, here is what I get on my system, where the stock grep has not been updated:

    wgl:$ /usr/bin/grep --version
    /usr/bin/grep --version
    grep (BSD grep) 2.5.1-FreeBSD
    
    wgl:$ /usr/local/bin/ggrep --version
    /usr/local/bin/ggrep --version
    ggrep (GNU grep) 3.3
    Packaged by Homebrew
    Copyright (C) 2018 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    
    Written by Mike Haertel and others; see
    <https://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
    
    wgl:$ /usr/bin/time /usr/local/bin/ggrep LiteonTe really-big.text | wc -l
    /usr/bin/time /usr/local/bin/ggrep LiteonTe really-big.text | wc -l
            2.30 real         1.04 user         0.67 sys
        1228
    wgl:$ /usr/bin/time /usr/bin/grep LiteonTe really-big.text | wc -l
    /usr/bin/time /usr/bin/grep LiteonTe really-big.text | wc -l
            5.65 real         5.30 user         0.33 sys
        1228
    wgl:$ /usr/bin/time /usr/local/bin/ggrep LiteonTe really-big.text >/dev/null
    /usr/bin/time /usr/local/bin/ggrep LiteonTe really-big.text >/dev/null
            0.05 real         0.03 user         0.01 sys
    wgl:$ /usr/bin/time /usr/bin/grep LiteonTe really-big.text >/dev/null
    /usr/bin/time /usr/bin/grep LiteonTe really-big.text >/dev/null
            6.50 real         5.71 user         0.58 sys
    wgl:$ /usr/bin/time /usr/local/bin/ggrep LiteonTe really-big.text -c
    /usr/bin/time /usr/local/bin/ggrep LiteonTe really-big.text -c
    1228
            2.33 real         1.05 user         0.69 sys
    wgl:$ /usr/bin/time /usr/bin/grep LiteonTe really-big.text -c
    /usr/bin/time /usr/bin/grep LiteonTe really-big.text -c
    1228
            5.37 real         5.05 user         0.31 sys
The wc -l is clearly polluting the result, and I suspect the >/dev/null is as well. Even in the worst case, though, GNU grep takes less than half the time of the old BSD grep (2.33 s versus 5.37 s with -c) (edited), which is consistent with what I see in my most common use of grep: looking through source files.
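A rough sketch of how the measurement could be kept to grep alone, using -c so that neither a pipe nor a redirect is involved. GNU grep is known to stop at the first match when it sees that stdout is /dev/null, which would plausibly explain the 0.05 s run above, so the sketch lets the count print instead of discarding it. The GREP variable, the generated sample file, and the three-run loop are assumptions for illustration, not what either poster actually ran.

```shell
#!/bin/sh
# Hypothetical: point GREP at the binary under test, e.g. /usr/local/bin/grep.
GREP=${GREP:-grep}

# Stand-in for packetdump.pcap / really-big.text: 1000 lines, every 10th one matches.
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) print (i % 10 == 0 ? "X-User-Agent: test" : "filler line " i) }' > "$sample"

# Same answer both ways, but the second form also involves the pipe and wc.
count_c=$("$GREP" -c X-User-Agent "$sample")
count_wc=$("$GREP" X-User-Agent "$sample" | wc -l | tr -d ' ')
echo "grep -c:      $count_c"
echo "grep | wc -l: $count_wc"

# Time grep by itself, several runs, without redirecting stdout to /dev/null.
if [ -x /usr/bin/time ]; then
    for run in 1 2 3; do
        /usr/bin/time "$GREP" -c X-User-Agent "$sample"
    done
fi

rm -f "$sample"
```

Both counts come out as 100 for the generated file. For a real comparison, swap in the actual file and run once per binary with GREP set accordingly.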
The compiler is also going to make an impact; for me it is consistent across both grep binaries: FreeBSD clang version 6.0.1 with -O2.

I suspect there are still edge cases where BSD grep is quite a bit slower than, or not compatible with, GNU grep. However, with a closer apples-to-apples comparison there isn't much difference anymore for my usage, which is a lot of grep use, but pretty vanilla.
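To put a number on "not much difference", here is a small sketch that averages the wall-clock ("real") times from the six packetdump.pcap runs quoted in this thread. The helper name mean and the variable names are made up for illustration; only the timings come from the thread.

```shell
#!/bin/sh
# "real" seconds copied from the three BSD grep and three GNU grep runs above.
bsd_real="0.54 0.54 0.54"
gnu_real="0.58 0.60 0.59"

# Hypothetical helper: average the numbers passed as arguments.
mean() {
    echo "$@" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.2f\n", s / NF }'
}

bsd_mean=$(mean $bsd_real)
gnu_mean=$(mean $gnu_real)
ratio=$(awk -v g="$gnu_mean" -v b="$bsd_mean" 'BEGIN { printf "%.2f\n", g / b }')

echo "BSD grep mean: ${bsd_mean}s"
echo "GNU grep mean: ${gnu_mean}s"
echo "GNU / BSD:     ${ratio}"
```

That gives 0.54 s for BSD grep, 0.59 s for GNU grep, and a ratio of 1.09, so on that file the two are within roughly 10% of each other. Three runs each is a small sample, so treat it as a ballpark figure.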

There may also be other OS differences in our comparison. My tests were run against a fairly recent FreeBSD 12-STABLE.