Hacker News
by gorset 5557 days ago
But you can skip ahead without parsing all the bytes, since UTF-8 is self-synchronizing. The only scenario I can envision is that GNU grep wants to perform Unicode normalization, to catch equivalent code point sequences, and that this is implemented inefficiently.

Edit: Granted, since I'm using -c, it has to look at all the bytes to find all the newlines.
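
Even so, finding the newlines does not require decoding anything. A rough sketch under the same assumption of valid UTF-8 (`count_newlines` is a made-up name, not anything from grep): the byte 0x0A can never occur inside a multibyte sequence, so a plain byte scan gives the same count as scanning the decoded text.

```python
def count_newlines(buf: bytes) -> int:
    """Count line feeds by scanning raw bytes, without decoding.

    Hypothetical helper for illustration. In UTF-8 every byte of a
    multibyte sequence is >= 0x80, so 0x0A is always a real newline.
    """
    return buf.count(b"\n")


text = "日本語\ncafé\nplain ascii\n"
raw_count = count_newlines(text.encode("utf-8"))
decoded_count = text.count("\n")
```

So every byte still gets touched, but only with a single comparison each.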