Has anyone overcome tsvector's position limit of 16,383?
That and the automatic stemming and lemmatization of search words, even in phrase searches, make Postgres awful for any software where accurate search is critical.
I remember when I ran into that first issue on my forum, I felt like I was the first person to ever use Postgres full-text search. (8?) years ago, googling the exact error brought up nothing except some dev email/listserv chatter. Nobody else was indexing longer text documents? Wasn't very encouraging.
Oh well. The vast majority of forum posts don't hit that limit so I just excluded the exceptionally long posts from the index. I never revisited it again.
I used a trigger function to detect long text and trim the source before it got indexed. That meant that any text over 500 kB just got dropped from the index. I also used one index per long text field rather than combining it with other fields.
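For anyone who wants to do the same trimming on the application side instead of in a trigger, here is a rough Python sketch. It is not the trigger described above: the function name `trim_for_index` is made up, and it assumes the 500 kB cutoff means UTF-8 bytes and that the overflow is cut off rather than the whole document being skipped.

```python
# Hypothetical cutoff: the ~500 kB figure from the post, assumed to mean UTF-8 bytes.
MAX_INDEX_BYTES = 500 * 1000


def trim_for_index(text: str, max_bytes: int = MAX_INDEX_BYTES) -> str:
    """Return text cut to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a partial multi-byte sequence left at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

The trimmed string is what you would pass to `to_tsvector`; the full text still goes into the regular column.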
The lack of BM25 relevance scoring is the bigger problem. Postgres FTS is fine as a better "LIKE" filter with very little overhead, but it's a poor choice for serious search applications or scale.
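To make the BM25 point concrete, here is a minimal Python sketch of Okapi BM25 over pre-tokenized documents. It is an illustration, not anything Postgres ships: the function name `bm25_scores` is made up, `k1=1.2` and `b=0.75` are just commonly used defaults, and the IDF uses the non-negative `log(1 + ...)` variant. The thing to notice is that the score depends on corpus-wide statistics (document frequency and average document length), which a per-document ranking function has no access to.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in docs against query_terms with Okapi BM25."""
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Usage would look like `bm25_scores(["postgres"], [["postgres", "search"], ["search", "index"]])`, which gives a positive score to the first document and zero to the second.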