|
|
|
|
|
by alexpasmantier
655 days ago
|
|
Ahh! Great catch, and thanks for taking the time to put that in writing. I did set gg to default to 4 threads, which seemed to be the optimal number on my machine for the typical repo sizes I navigate daily. Increasing the number of threads beyond that often results in unnecessary overhead for my personal use cases. I appreciate you pointing out the heuristic used in the ripgrep project. From what I understand, it also uses a fixed, machine-dependent number of threads, predetermined regardless of the task at hand (except for single-file tasks). This is something I was curious about while writing the code but couldn't fully answer due to my limited knowledge of the subject: could we potentially use a filesystem-specific heuristic to estimate the workload and dynamically adjust the number of threads accordingly? What I mean is a method, perhaps within the ignore crate, to estimate the amount of data to process—such as the number of files, file sizes, or number of lines—based on easily and cheaply accessible filesystem metadata. |
|
The only other option I can think of is to dynamically adjust. Maybe after a certain amount of work has completed, spin up more threads. But I'm not sure it's worth doing.