by theryangeary 268 days ago
In the past few days I've been benchmarking my own tool `choose` against BSD cut, GNU cut, and uutils cut, and uutils cut is certainly faster than the BSD and GNU versions: https://github.com/theryangeary/choose/blob/master/benchmark...

The benchmark runs against unibyte text. You would get more accurate results by adding `export LC_ALL=C` to your benchmark script.
I tried adding LC_ALL=C as well as LC_ALL=en_US.UTF-8 and it didn't make much of a difference outside of BSD cut[0].

The input file to the benchmark is all ASCII text chars (unibyte?), and `choose` takes the safe(r?) route of assuming all text is UTF-8 and handling it accordingly.
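
To make the "unibyte" point concrete, here is a minimal Python sketch (not part of `choose` or the benchmark; the helper name `is_unibyte` and the sample strings are made up for illustration). ASCII is a subset of UTF-8, so all-ASCII input decodes identically either way and every character occupies exactly one byte, which is one plausible reason the locale setting changes so little for this input.

```python
def is_unibyte(text: str) -> bool:
    """Return True if every character of `text` encodes to one UTF-8 byte.

    Hypothetical helper for illustration only.
    """
    return len(text.encode("utf-8")) == len(text)


# Made-up sample resembling space-separated benchmark input.
ascii_line = "field1 field2 field3\n"

# ASCII bytes are valid UTF-8 and decode to the same string.
assert ascii_line.encode("ascii").decode("utf-8") == ascii_line
assert is_unibyte(ascii_line)

# Non-ASCII text needs more than one byte for some characters.
assert not is_unibyte("caf\u00e9")
```

A benchmark that wants to exercise the multibyte code paths would need input for which `is_unibyte` returns False.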

[0]:

          | LC_ALL=C   | LC_ALL=en_US.UTF-8   | LC_ALL not set explicitly
  --------+------------+----------------------+--------------------------
  choose  | 110.6 ms   | 110.6 ms             | 110.8 ms
  cut     | 813.9 ms   | 983.9 ms             | 971.7 ms
  gcut    | 172.8 ms   | 172.5 ms             | 174.0 ms
  ucut    | 78.22 ms   | 79.39 ms             | 79.38 ms
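
For anyone who wants to repeat this kind of locale comparison, here is a rough Python sketch. It is not the benchmark script linked above; the function name `time_under_locale` is hypothetical, and the demo times the Python interpreter itself only so that the example runs anywhere.

```python
import os
import subprocess
import sys
import time


def time_under_locale(cmd, lc_all=None, runs=5):
    """Return the best wall-clock time in seconds of `cmd` over `runs` runs.

    lc_all=None removes LC_ALL from the child's environment, which
    corresponds to the "not set explicitly" column of the table.
    """
    env = dict(os.environ)
    env.pop("LC_ALL", None)
    if lc_all is not None:
        env["LC_ALL"] = lc_all

    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,  # raise if the command fails
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Placeholder command: swap in the real tool invocation and input file.
    demo_cmd = [sys.executable, "-c", "pass"]
    for locale_name in ("C", "en_US.UTF-8", None):
        elapsed = time_under_locale(demo_cmd, lc_all=locale_name, runs=3)
        print(f"LC_ALL={locale_name}: {elapsed * 1000:.1f} ms")
```

Taking the minimum over several runs is just one simple way to reduce noise; a dedicated benchmarking tool with warmup runs and statistics will give more trustworthy numbers.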