|
|
|
|
|
by gutensearch
1979 days ago
|
|
Thanks for noticing!
To be more specific, "what is a" are stop words and "cynic" is very common, so a lot of rows are returned (see my other comment). ts_rank takes too long to rank them, and the server times out, leaving you with the previous query's table because I didn't take the time to program a correct response to this issue. "Cecil Graham. What is a cynic?" returns Lady Windermere's Fan almost instantly. The workarounds I've thought of would be to cache these queries (assuming I've seen them before, and after I've set up logging), buy a larger server, or pay Second Quadrant to speed up ts_rank... I'd love any suggestions from more experienced Postgres engineers! Edit to your edit, re parsing. The subset of rows returned follows: where language = %s::regconfig and textsearchable_index_col @@ phraseto_tsquery(%s::regconfig, %s)
and relevance is determined by: ts_rank_cd(textsearchable_index_col, phraseto_tsquery(%s::regconfig, %s), 32)
with the %s being language and paragraph respectively. |
|
As others have mentioned, there isn't one "right answer" but there are a few nifty tricks you can use to boost performance.
You could try a trigram index[0] as the primary search index. You'd loose stemming, but gain language support and indexed regular expression search which is a nice power user feature.
Looking at your where clause, did you index language and textsearchable_index_col together ? If not, you are giving the planner the opportunity to skip your full text index.
if you order by ts_rank_cd then the query will sort all rows, including those that don't match. A better pattern is to take the ranking score as a column and then sort it in a subquery. [1] from stack overlfow has an example. (As an aside, from pg2, CTEs are no longer an optimization fence, so you can write the query out more cleanly with a CTE and still get the desired performance).
You should experiment with GIN vs GIST indices. GIN are larger, so on a large dataset the index might not fit in memory. You could pay more to have more memory, but worth trying a GIST index to see if that makes things faster just because it fits in memory.
A final frontend comment, I'm a fan of infinite scroll for this kind of stuff. You already have pagination effectively set up, you could spoil me as a user with infinite scroll. react-virtualized[2] is a great library for that.
Hope that helps.
[0] https://www.postgresql.org/docs/9.6/pgtrgm.html [1] https://dba.stackexchange.com/questions/4054/ts-rank-massive... [2] https://github.com/bvaughn/react-virtualized