by glangdale
2691 days ago
> 2. Nevod matches MULTIPLE patterns against document in ONE PASS. Patterns/expressions are indexed by state machine and filtered effectively during matching.

This is a good idea. It's such a good idea that we did it in 2006, sold a company to Intel based around it in 2013 and open-sourced the resulting library (https://github.com/intel/hyperscan) in 2015.

All multi-pattern systems get a huge improvement over one-at-a-time regex. It would be embarrassing if they did not. Our rule of thumb for Hyperscan was that it should be about 5-10x faster than libpcre on single patterns and the gap should get progressively bigger.

I don't doubt that there is quite a bit of interesting work (entirely outside of regex - the word-based focus looks interesting) going on in this project, but the fact that you couldn't be bothered to Google around for a sensible baseline is not encouraging.

Also, which bit of your system do you think is novel enough to patent? You may find considerable prior art out there, which you don't really seem to be aware of based on how much emphasis you're placing on the not-entirely-novel idea of using a state machine.
However, we all need to keep in mind that Nevod operates on words, not characters, and that it targets large rule decks (1,000-100,000 patterns). So we don't compete in the field of individual pattern matching or streaming matching (to some extent, the goals of Hyperscan and Nevod are different). In other words, a lot depends on the target goals, which are usually reflected in the kind of patterns and in their number.
Regarding the patent: sorry, for legal reasons I can't disclose details at the moment.
Thank you for your valuable comments.
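
For readers following the thread, here is a minimal sketch of the general idea being discussed: matching many word-level patterns against a token stream in one pass by indexing them in a shared state machine. It is not how Nevod or Hyperscan is implemented. It is a plain trie over words with a set of active states, and the names `build_trie` and `match_all` are hypothetical, chosen only for this illustration.

```python
from typing import Dict, List, Tuple


def build_trie(patterns: Dict[str, List[str]]) -> dict:
    """Index all patterns (sequences of words) in one shared trie."""
    root = {"next": {}, "ids": []}
    for pattern_id, words in patterns.items():
        node = root
        for word in words:
            node = node["next"].setdefault(word, {"next": {}, "ids": []})
        # Mark the node where this pattern ends.
        node["ids"].append(pattern_id)
    return root


def match_all(root: dict, tokens: List[str]) -> List[Tuple[str, int, int]]:
    """Scan the tokens once; report (pattern_id, start, end) for every match."""
    hits = []
    active = []  # partial matches as (start_index, trie_node)
    for i, token in enumerate(tokens):
        # A new match may begin at every token position.
        active.append((i, root))
        advanced = []
        for start, node in active:
            child = node["next"].get(token)
            if child is None:
                continue  # this partial match dies here
            for pattern_id in child["ids"]:
                hits.append((pattern_id, start, i + 1))
            advanced.append((start, child))
        active = advanced
    return hits


patterns = {
    "greeting": ["good", "morning"],
    "time": ["morning"],
    "long": ["good", "morning", "everyone"],
}
trie = build_trie(patterns)
hits = match_all(trie, "good morning everyone".split())
```

Each token is read once regardless of how many patterns are loaded, and patterns that share a prefix share trie states. A production engine would add much more (wildcards, alternation, failure links or determinization), so treat this only as an illustration of why multi-pattern matching beats running one pattern at a time.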