


	by esac 3587 days ago
	I'm pretty sure the bottleneck is disk I/O, not the CPU.

2 comments

ggreer 3587 days ago

When searching for a literal string, the bottleneck tends to be memory bandwidth. When doing regex searches, the bottleneck is usually CPU. If caches are cold, then disk I/O is the limiting factor. Even in that case, technologies like NCQ (Native Command Queuing) allow some degree of concurrency.
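To get a rough feel for the literal-versus-regex difference, here is a minimal Python sketch that times a plain substring scan against a regex scan over the same in-memory buffer. It uses Python's own `bytes.count` and `re` module, not ag's matcher, and the test data and the pattern `n[e3]+dle` are made up for illustration, so treat the numbers as indicative at best.

```python
import re
import time


def count_literal(haystack: bytes, needle: bytes) -> int:
    """Count non-overlapping occurrences using a plain substring scan."""
    return haystack.count(needle)


def count_regex(haystack: bytes, pattern: bytes) -> int:
    """Count non-overlapping matches using the regex engine."""
    return sum(1 for _ in re.finditer(pattern, haystack))


def throughput_mb_s(fn, haystack: bytes, arg: bytes) -> float:
    """Run fn(haystack, arg) once and return megabytes scanned per second."""
    start = time.perf_counter()
    fn(haystack, arg)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return len(haystack) / elapsed / 1e6


if __name__ == "__main__":
    # Synthetic buffer of roughly 20 MB, held entirely in RAM (assumption:
    # this stands in for file contents that are already cached).
    data = (b"lorem ipsum dolor sit amet " * 40 + b"needle\n") * 20_000
    print("literal MB/s:", round(throughput_mb_s(count_literal, data, b"needle")))
    print("regex   MB/s:", round(throughput_mb_s(count_regex, data, rb"n[e3]+dle")))
```

Since the buffer never touches disk, this only compares the two in-memory cases; it says nothing about the cold-cache case.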

If you have ag[1], you can play around with the --workers option to see how various numbers of threads change performance. (The default is for ag to use #CPUs-1 workers.)

1. https://github.com/ggreer/the_silver_searcher
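One way to try the `--workers` experiment is a small timing loop like the sketch below. It assumes `ag` is on your PATH; the worker counts tried and the helper names `build_command` and `time_command` are arbitrary choices for illustration. Each setting gets one untimed warm-up run so that the timed run is more likely to hit warm caches.

```python
import shutil
import subprocess
import sys
import time


def build_command(workers, pattern, path):
    """Build the ag invocation for a given worker count."""
    return ["ag", "--workers", str(workers), pattern, path]


def time_command(cmd):
    """Run cmd, discard its output, and return wall-clock seconds."""
    start = time.perf_counter()
    # check=False: ag exits non-zero when nothing matches, which is not an error here.
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: bench_ag.py PATTERN PATH")
    if shutil.which("ag") is None:
        sys.exit("ag not found on PATH")
    pattern, path = sys.argv[1], sys.argv[2]
    for workers in (1, 2, 4, 8):  # arbitrary sample of thread counts
        cmd = build_command(workers, pattern, path)
        time_command(cmd)  # warm-up run, not reported
        print(f"{workers} workers: {time_command(cmd):.3f}s")
```

A single timed run per setting is noisy; repeating each measurement a few times and taking the minimum would give steadier numbers.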


exDM69 3587 days ago

Even when disk caches are involved (i.e., the data is in RAM, not in CPU cache), a typical grep application (short "needle" to search for) should saturate the memory bandwidth before it saturates the CPU cores.

Running a few threads/processes in parallel could improve throughput by hiding latency, but adding more beyond that shouldn't give any benefit.
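The saturation argument can be written down as a deliberately crude model: aggregate throughput grows with the thread count until it hits the memory bandwidth ceiling, then stays flat. The function name and the figures of 6 GB/s per thread and 20 GB/s of memory bandwidth are hypothetical placeholders, not measurements, and the model ignores latency hiding and cache effects entirely.

```python
def modeled_throughput(threads, per_thread_gb_s, memory_bw_gb_s):
    """Aggregate scan rate in GB/s, capped by the memory bandwidth ceiling."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return min(threads * per_thread_gb_s, memory_bw_gb_s)


if __name__ == "__main__":
    # Hypothetical figures: 6 GB/s per thread, 20 GB/s memory bandwidth.
    for n in (1, 2, 4, 8, 16):
        print(n, "threads ->", modeled_throughput(n, 6.0, 20.0), "GB/s")
```

With those placeholder figures the ceiling is reached at four threads, and every thread beyond that adds nothing.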
