by burntsushi
2928 days ago
Note that you don't need Boyer-Moore for the common case. ripgrep, for example, will very rarely use Boyer-Moore. Its workhorse is much simpler and typically faster: https://github.com/rust-lang/regex/blob/master/src/literal/m...

In Go-land, you should be able to replace uses of memchr with IndexByte[1], which should be implemented in assembly on most platforms. Of course, for any of this to have a big impact, you'll want to take Mike Haertel's advice on avoiding line breaking and stop using bufio.Scanner. :-)

[1] - https://golang.org/pkg/bytes/#IndexByte
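
To make the memchr-to-IndexByte suggestion concrete, here is a minimal sketch in Go. It is not ripgrep's or the regex crate's actual code: the helper name indexLiteral is hypothetical, and scanning for the needle's first byte is an assumption made for simplicity (a real implementation might pick a rarer byte). The idea is to let bytes.IndexByte skip quickly to candidate positions and only then verify the full literal.

```go
package main

import (
	"bytes"
	"fmt"
)

// indexLiteral is a hypothetical helper: it returns the index of the first
// occurrence of needle in haystack, or -1 if there is none. It uses
// bytes.IndexByte (the memchr analogue) to jump to candidate positions.
func indexLiteral(haystack, needle []byte) int {
	if len(needle) == 0 {
		return 0
	}
	first := needle[0] // assumption: scan for the first byte of the needle
	offset := 0
	for {
		// Skip ahead to the next position where the first byte occurs.
		i := bytes.IndexByte(haystack[offset:], first)
		if i < 0 {
			return -1
		}
		start := offset + i
		// Not enough bytes left for a full match.
		if len(haystack)-start < len(needle) {
			return -1
		}
		// Verify the candidate.
		if bytes.Equal(haystack[start:start+len(needle)], needle) {
			return start
		}
		offset = start + 1
	}
}

func main() {
	fmt.Println(indexLiteral([]byte("hello world"), []byte("wor")))
	fmt.Println(indexLiteral([]byte("hello world"), []byte("xyz")))
}
```

In practice bytes.Index already covers this case in the standard library; the sketch only shows the shape of the skip-then-verify loop.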
There are a couple of places I wish I had done better. Using bufio.Scanner actually bothers me a lot. Also, the Read() method reads everything from all readers into a buffer instead of pulling only what it needs to check.
Thanks for the suggestions :)