Hacker News
by genewitch 74 days ago
considering that ripgrep has marginal overhead over just reading the files to /dev/null, how exactly does this achieve a 100x speedup?

I have a lot of use for something that can search ~1GB of text "instantly", but so far nothing beats rg/ag after the data has been moved into RAM.

3 comments

The trick to optimization is not "doing faster" but "doing less". I already feel rg is missing a ton of results I want to see because it has a very large ignore list by default.
i see this - complaint? - often, but i use grep for finding text in files in the filesystem, like normal people. For specific datasets, though, i'll use ag/rg. As an example, i have transcribed all of the "shows" i have access to for a couple of radio programs. When i want to do exploratory searches, i hit the set once with ag/rg, which takes 7-14 seconds to warm up, then it's <1ms to search all 1500 text files or whatever.

So while i'm sure ag/rg may be frustrating to use in certain circumstances, by default it works great for searching text files, even structured text files, on disk.
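
A rough way to see the warm-up effect described above is to time the same search twice over one directory. This is only a sketch: `search_files` and `timed_search` are hypothetical helper names, the `*.txt` pattern is an assumption about the dataset, and a plain Python loop will be far slower than ag/rg. On most systems the second pass tends to be faster because the files are already in the OS page cache.

```python
import sys
import time
from pathlib import Path


def search_files(root, needle):
    """Return (path, line_number) pairs for every line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        with open(path, "r", encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if needle in line:
                    hits.append((str(path), lineno))
    return hits


def timed_search(root, needle):
    """Run search_files once and return (hits, elapsed_seconds)."""
    start = time.perf_counter()
    hits = search_files(root, needle)
    return hits, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) == 3:
    root, needle = sys.argv[1], sys.argv[2]
    _, first = timed_search(root, needle)   # may have to read from disk
    _, second = timed_search(root, needle)  # usually served from the page cache
    print(f"first pass: {first:.3f}s, second pass: {second:.3f}s")
```

Run it as `python warm.py <directory> <word>`; the file name is arbitrary.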

alias rg="rg -iuu"
The crate says it uses SIMD, but the crate also says that content search is 20-50 times faster. Maybe the guy is unsure how fast it is or how much speedup he should claim to get recognition.
it very much depends on the platform and the operating system

for example ripgrep doesn't do any memory mapping on macOS which makes it 2-3x faster just because of that
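
To make the memory-mapping point concrete, here is a minimal sketch of the two ways to get at a file's bytes before searching them. It is not how ripgrep or fff are implemented, and the function names are made up for illustration. Which variant is faster depends on the platform, the file size, and whether the file is already cached.

```python
import mmap


def find_with_read(path, needle: bytes) -> int:
    """Copy the whole file into a bytes object, then search it."""
    with open(path, "rb") as fh:
        return fh.read().find(needle)


def find_with_mmap(path, needle: bytes) -> int:
    """Map the file into memory and search the mapping directly."""
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; mmap raises ValueError on empty files.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle)
```

Both functions return the byte offset of the first match, or -1 if the needle is absent.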

you can try it yourself. a ripgrep search for "MAX_FILE_SIZE" in the chromium repo takes 6-7 seconds, with fff it is 20 milliseconds

so essentially in this specific case it is roughly 300x faster (6-7 s vs 20 ms), but the repo size is huge (66G, 500k files)
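
For reference, the ratio implied by the timings quoted above (6-7 seconds versus 20 milliseconds) can be worked out directly. This uses only those quoted figures, not a new measurement, and `speedup` is just an illustrative helper name.

```python
def speedup(baseline_seconds, new_seconds):
    """How many times faster the new timing is than the baseline."""
    return baseline_seconds / new_seconds


low = speedup(6.0, 0.020)   # 6 s versus 20 ms
high = speedup(7.0, 0.020)  # 7 s versus 20 ms
print(f"{low:.0f}x to {high:.0f}x")  # prints "300x to 350x"
```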