by mgedmin 5298 days ago
Is this really a bug in grep, rather than a bug in Solaris's libc? I've never seen grep so slow, and I've been using UTF-8 locales for years.

I'm not denying that grep was buggy (there's a link to a bug in grep's bug tracker that was closed more than a year ago), but I'm surprised at the magnitude of the slowdown.

2 comments

The official GNU grep used to be absurdly slow in UTF-8 locales. Linux distributors noticed this very quickly and patched it when they switched to UTF-8 by default. But GNU grep maintenance was essentially dormant for years, and these patches were only integrated upstream in 2010.

For an old, unpatched GNU grep a 2000x slowdown is quite believable.
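If you want to see how your own grep behaves, here is a rough sketch that times the same search under the C locale and under a UTF-8 locale. The helper names (`locale_env`, `time_command`, `compare_locales`) are made up for illustration, and `en_US.UTF-8` is only an assumption about which UTF-8 locale is installed; check `locale -a` on your system.

```python
import os
import subprocess
import time


def locale_env(locale_name):
    """Return a copy of the current environment with LC_ALL forced to locale_name."""
    env = dict(os.environ)
    env["LC_ALL"] = locale_name
    return env


def time_command(argv, locale_name):
    """Run argv once under the given locale; return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        env=locale_env(locale_name),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def compare_locales(pattern, path, locales=("C", "en_US.UTF-8")):
    """Time `grep -c pattern path` under each locale; return {locale: seconds}.

    The default locale names are assumptions, not something every system has.
    """
    return {
        name: time_command(["grep", "-c", pattern, path], name)
        for name in locales
    }
```

Calling something like `compare_locales("foo", "/path/to/big.log")` returns a dict of timings per locale. A single run is noisy (disk cache, other load), so repeat it a few times before reading much into the ratio.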

The poster is using grep 2.5.3. Release notes for 2.6 talk about UTF-8 performance improvements:

"This release fixes an unexpectedly large number of flaws, from outright bugs (surprisingly many, considering this is "grep") to some occasionally debilitating performance problems."

The current version of grep is 2.10.