Hacker News new | ask | show | jobs
by StephanTLavavej 2247 days ago
The smallest test case appears to be a needle of "aaa" and a haystack of "xxaaa", where VS 2019 16.5's boyer_moore_searcher will report "not found", while the corrected code finds the needle at offset 2 in the haystack.
1 comments

Seems astonishing that nobody has run into this in the wild. How does it do on finding ‘www’ in ‘http://www.example.com/, for example?
I conclude that usage of std::boyer_moore_searcher is relatively low, despite this C++17 feature shipping in VS 2017 15.3 (August 2017).

"www" triggers the bug like all 3-character repeats, but Boyer-Moore manages to find it in "http://www.example.com/" despite the damaged delta2 table, because the incorrect shift value is never exercised. However, a haystack of "https://www.example.com/" triggers the bug, because now the "www" is at an offset of 8.

You need a needle that triggers the problem, AND a haystack where the delta2 table matters, AND the algorithm to stop at a problematic index where it uses a bad delta2 value.

There's a reason the published version even in 'the papers' was wrong for 3 years.