| @burntsushi Hi! First of all, thank you for taking the time to write this. I've been using ripgrep for quite some time, and it's an amazing piece of software. Having your comment here is truly an honor.

> I'm not sure I totally get the motivation here to be honest

This is primarily a small project I started to familiarize myself with Rust. I thought that exploring the basics of ripgrep and attempting to build something similar would be a good way to get started.

> Also, the flags that it does support are overriding long-held custom that are likely to be confusing to users

Noted. I'll consider making these changes to avoid potentially confusing anyone.

> It's also pretty annoying to share screenshots of benchmarks instead of just showing a simple copyable command with a paste of the results.

I've updated the documentation with the actual commands and included a copy of the results.

> I also can't quite reproduce at least the curl benchmark

I just ran the curl benchmark again on the same machine (my work laptop, an M3 Apple MacBook), and here are the results:

$ hyperfine "rg '[A-Z]+_NOBODY' ." "gg '[A-Z]+_NOBODY'" "ggrep -rE '[A-Z]+_NOBODY' ."
Benchmark 1: rg '[A-Z]+_NOBODY' .
Time (mean ± σ): 38.5 ms ± 2.2 ms [User: 18.1 ms, System: 207.3 ms]
Range (min … max): 33.8 ms … 42.8 ms 72 runs
Benchmark 2: gg '[A-Z]+_NOBODY'
Time (mean ± σ): 21.8 ms ± 0.8 ms [User: 15.4 ms, System: 53.1 ms]
Range (min … max): 20.2 ms … 23.8 ms 115 runs
Benchmark 3: ggrep -rE '[A-Z]+_NOBODY' .
Time (mean ± σ): 73.3 ms ± 0.9 ms [User: 26.5 ms, System: 45.7 ms]
Range (min … max): 70.8 ms … 75.6 ms 41 runs
Summary
gg '[A-Z]+_NOBODY' ran
1.77 ± 0.12 times faster than rg '[A-Z]+_NOBODY' .
3.36 ± 0.13 times faster than ggrep -rE '[A-Z]+_NOBODY' .
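
For reference, the `1.77 ± 0.12` figure in the summary can be reproduced from the means and standard deviations listed above. Here is a minimal Rust sketch, assuming the two measurements are independent and that the uncertainty is combined with standard first-order error propagation. That is an assumption about how the summary is derived, although it does match the reported numbers. `Timing` and `relative_speed` are hypothetical names used only for illustration.

```rust
/// Mean and standard deviation of one benchmark, in milliseconds.
#[derive(Clone, Copy, Debug)]
struct Timing {
    mean: f64,
    stddev: f64,
}

/// Returns `(ratio, uncertainty)` describing how many times faster `fast`
/// is than `slow`. Assumes the two measurements are independent, so the
/// relative errors add in quadrature.
fn relative_speed(slow: Timing, fast: Timing) -> (f64, f64) {
    let ratio = slow.mean / fast.mean;
    let rel_err = ((slow.stddev / slow.mean).powi(2)
        + (fast.stddev / fast.mean).powi(2))
    .sqrt();
    (ratio, ratio * rel_err)
}
```

With `rg` at 38.5 ± 2.2 ms and `gg` at 21.8 ± 0.8 ms, this gives roughly 1.77 ± 0.12, and `ggrep` at 73.3 ± 0.9 ms gives roughly 3.36 ± 0.13. Note that the uncertainty here only reflects run-to-run noise on one machine, not differences between machines.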
> It looks like it's assuming that the `ArrayQueue` it uses is never full?

I used a default maximum size for the queue (configurable via the `--max-results` argument) to pre-allocate it, as I thought this might improve performance. However, I'm currently not handling errors properly and just allowing the program to panic when the number of results exceeds the set limit.

> So why doesn't it have the same performance profile as ripgrep?

Given the differences in execution times between our benchmarks, I suspect that variations in filesystems and underlying storage hardware explain the significantly different results we're observing, since ripgrep's (and, by extension, gg's) performance bottleneck is primarily disk I/O.
What do you think? |
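
On the full-queue point, one way to avoid the panic is to treat "queue is full" as an ordinary outcome instead of unwrapping it. crossbeam's `ArrayQueue::push` hands the rejected value back in `Err` when the queue is at capacity. The sketch below shows the same idea with the standard library's bounded `sync_channel` swapped in so that it compiles without external crates. It is not gg's actual code, and `PushOutcome`, `try_record`, and `record_all` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Outcome of trying to record one search result.
#[derive(Debug, PartialEq)]
enum PushOutcome {
    /// The result was stored in the queue.
    Stored,
    /// The queue was at capacity; the rejected result is handed back.
    Full(String),
    /// The consumer side has gone away.
    Closed,
}

/// Tries to enqueue `result` without blocking and without panicking.
fn try_record(tx: &SyncSender<String>, result: String) -> PushOutcome {
    match tx.try_send(result) {
        Ok(()) => PushOutcome::Stored,
        Err(TrySendError::Full(rejected)) => PushOutcome::Full(rejected),
        Err(TrySendError::Disconnected(_)) => PushOutcome::Closed,
    }
}

/// Pushes every entry of `results` into a queue pre-allocated with
/// `max_results` slots and returns `(stored, dropped)` counts.
fn record_all(results: &[&str], max_results: usize) -> (usize, usize) {
    let (tx, rx): (SyncSender<String>, Receiver<String>) = sync_channel(max_results);
    let mut dropped = 0;
    for r in results {
        // A full queue is counted instead of crashing the program.
        if let PushOutcome::Full(_) = try_record(&tx, r.to_string()) {
            dropped += 1;
        }
    }
    drop(tx); // close the channel so the drain below terminates
    let stored = rx.iter().count();
    (stored, dropped)
}
```

Whether to drop the extra results, stop the search early, or fall back to an unbounded queue is a design choice; the point is only that the overflow case gets an explicit branch.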
Since you're running on macOS, I'll do the same. I have an M2 Mac mini. My previous benchmarks were on my Linux workstation.

Your `curl` benchmark:
So slightly edged out by `gg` here, but not as big of a difference as you're seeing. What version of ripgrep are you using?

Also, as I said before, these times are pretty short. Try a bigger corpus. For example, in my clone of Linux (also on my M2 Mac mini):
It is very interesting that the differences are almost zero on macOS but quite a bit bigger on Linux. That might be worth investigating.

IMO, if you're advertising "circumstantially faster than ripgrep," then you should be able to characterize the circumstances in which that occurs.