
Yes, mainly trigrams, but bigrams or a combination of both are also commonly used to implement fuzzy search; zoekt uses a trigram index too. But such indices depend heavily on the content being indexed: for example, if a query contains a rare trigram that was never indexed, they fail to return relevant results! LSH implementations, on the other hand, employ a more diverse collection of stats, depending on the number of buckets and the N-gram/window size used, so they compare better against unseen content/bytes at query time. It is not cheap though, as each hash is around 30 bytes, which most of the time is even more than the string/text being indexed! But it leads to fixed-size hashes, independent of the size of the content indexed, and acts as an "auxiliary" index which can be queried independently of the original index. Comparison of hashes can be optimized, leading to quite fast fuzzy search.
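To make the contrast concrete, here is a rough Python sketch of both ideas. Everything in it is an assumption for illustration: `TrigramIndex`, `bucket_digest` and `hamming` are hypothetical names, the trigram search is a strict AND over postings (real engines such as zoekt are more elaborate), and the digest is a deliberately simplified bucket-count scheme, not the algorithm of any particular LSH library. With 256 buckets and one bit per bucket it happens to produce 32 bytes, in the same ballpark as the "around 30 bytes" mentioned above.

```python
import zlib
from collections import defaultdict
from statistics import median


def ngrams(text, n=3):
    """Set of overlapping n-character windows in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class TrigramIndex:
    """Toy inverted index: trigram -> ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for gram in ngrams(text):
            self.postings[gram].add(doc_id)

    def search(self, query):
        """Strict AND over the query's trigrams (an assumption of this sketch)."""
        grams = ngrams(query)
        if not grams:
            return set()
        hits = [self.postings.get(gram, set()) for gram in grams]
        return set.intersection(*hits)


def bucket_digest(text, buckets=256, n=3):
    """Fixed-size digest: one bit per bucket, set when that bucket's n-gram
    count is above the median count. Always buckets // 8 bytes long."""
    counts = [0] * buckets
    for i in range(len(text) - n + 1):
        gram = text[i:i + n].encode("utf-8")
        counts[zlib.crc32(gram) % buckets] += 1
    cutoff = median(counts)  # 0 for short texts, so the bit means "bucket was hit"
    bits = 0
    for idx, count in enumerate(counts):
        if count > cutoff:
            bits |= 1 << idx
    return bits.to_bytes(buckets // 8, "little")


def hamming(a, b):
    """Number of differing bits between two equal-length digests."""
    diff = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return bin(diff).count("1")


index = TrigramIndex()
index.add(1, "fuzzy search index")
exact = index.search("search")  # {1}: every query trigram was indexed
typo = index.search("serach")   # set(): "ser", "era", "rac", "ach" were never indexed

d_doc = bucket_digest("fuzzy search index")
d_typo = bucket_digest("fuzzy serach index")
distance = hamming(d_doc, d_typo)  # small: only buckets hit by the changed trigrams differ
print(exact, typo, len(d_doc), distance)
```

The digest length depends only on `buckets`, never on the input, so the digests can live in a separate structure and be compared with a XOR plus popcount, which is the kind of comparison that is easy to optimize. The cost is that a 32-byte digest is longer than many of the short strings it would describe.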