by aragot 4466 days ago
1) Subwords

Why not use: MY_COLUMN ~ '.wurst\y.' ? Here is the doc for LIKE, SIMILAR TO and regexes: http://www.postgresql.org/docs/9.2/static/functions-matching...

1 comments

Because a real full-text search solution goes much further than regexes and LIKE. It can also use indexes, which many regexes and many LIKE patterns cannot.

For example, I can now find the "wurst" in "Weisswürste". Yes, I could do that with a regex, but I can also find the "haus" in "Krankenhäuser", and every other special case in the language I'm working with, without having to write a special regex for every term I might come up with.
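To make the umlaut point concrete, here is a minimal Python sketch of one ingredient such a setup might use: folding diacritics before matching, so "ü" compares equal to "u". This is only an illustration under my own assumptions. The comment does not say how the full-text solution handles these cases, and the helper names fold and contains_subword are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and strip combining marks, so 'ü' becomes 'u'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains_subword(term: str, word: str) -> bool:
    """Hypothetical helper: substring test on the folded forms of both strings."""
    return fold(term) in fold(word)


if __name__ == "__main__":
    # A plain substring test misses the umlauted forms; the folded test does not.
    print("wurst" in "Weisswürste".lower())            # False
    print(contains_subword("wurst", "Weisswürste"))    # True
    print(contains_subword("haus", "Krankenhäuser"))   # True
```

Folding alone does nothing for indexing or for splitting compounds in general; it only shows why one normalization step can replace a pile of per-term regexes.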

Exactly. The only reason I have to bother with Solr at the moment is to get efficient ngram indexing for sequence data, which consists of lists of 7-8 character strings from various sources. What I have works, but feels like overkill for my case.
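For readers unfamiliar with ngram indexing, a rough in-memory sketch in Python follows. It is not how Solr implements it, and the class name NgramIndex, the trigram size, and the sample strings are all made up for illustration. The idea is to map every n-character slice to the strings that contain it, intersect those sets for a query, then verify the surviving candidates.

```python
from collections import defaultdict


def ngrams(s: str, n: int = 3) -> set:
    """Return the set of all length-n substrings of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


class NgramIndex:
    """Hypothetical in-memory inverted index from ngrams to string ids."""

    def __init__(self, n: int = 3):
        self.n = n
        self.items = []                   # id -> original string
        self.postings = defaultdict(set)  # ngram -> ids of strings containing it

    def add(self, s: str) -> int:
        item_id = len(self.items)
        self.items.append(s)
        for gram in ngrams(s, self.n):
            self.postings[gram].add(item_id)
        return item_id

    def search(self, query: str) -> list:
        grams = ngrams(query, self.n)
        if not grams:
            # Query is shorter than n, so the index cannot help: scan everything.
            return [s for s in self.items if query in s]
        # A match must contain every ngram of the query.
        candidates = set.intersection(*(self.postings.get(g, set()) for g in grams))
        # Sharing all ngrams does not guarantee a substring match, so verify.
        return [self.items[i] for i in sorted(candidates) if query in self.items[i]]


if __name__ == "__main__":
    index = NgramIndex()
    for s in ["ACGTACGT", "TTGACCA", "ACGTTTGA"]:  # made-up 7-8 character strings
        index.add(s)
    print(index.search("ACGT"))  # ['ACGTACGT', 'ACGTTTGA']
    print(index.search("GAC"))   # ['TTGACCA']
```

For short strings like these, the index only pays off once the collection is large enough that scanning every string per query becomes the bottleneck.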