|
|
|
|
|
by burntsushi
2691 days ago
|
|
While we're wishing for things, I'd love to see you write up how some of the algorithms (like Teddy) work. Hyperscan is clearly a treasure trove of them, but the effort required to extract its secrets is quite large! For example, I've spent quite a bit of time perusing its code (and your excellent mailing list posts), and I still don't think I could describe its execution flow at a high level. As far as literals go, I was doing that well before I heard about Hyperscan. I got the idea from hints in the literature. (On mobile or else I'd provide a link.) |
|
As I don't work for Intel any more, it's unlikely that I will put in the effort for anything significantly more comprehensive. I am considering another building another regex matcher but it wouldn't be the encyclopedic list of optimizations approach. Paul Terra refers to Hyperscan as the "Dwarf Fortress of regular expression matchers". I have some new ideas I'd like to try out, in any case.
I wasn't suggesting that your use of literal factoring comes from Hyperscan, only the Teddy matcher. Literal factoring seems an old technique and I don't know the genuinely first cite of that.