Hacker News new | ask | show | jobs
by agconti 3087 days ago
The Postgres docs suggest using another column. My guess is that an expression index would be too large if it held the tokenized value of all of your FTS documents. These things can and often are entire written documents. Imagine the index size for 2 Million rows of tokenized documents at 2,000 words each.

You might be able to get away with if if you were indexing a less then large amount of very small documents.

I like your expression index idea a lot.

1 comments

But if you're maintaining a separate ts_vector column, which you then index, you're creating the exact same amount of index data.

Unless you're saying that you would populate the field only on some rows and not all of them, and control this from the app. But you could do that with an expression index, too, assuming the rule is a simple, pure function:

  CREATE INDEX index_posts_on_body
  ON posts (to_tsvector(body, 'english')) 
  WHERE published = true;
or similar.