by mikljohansson 2784 days ago
Our query complexity and query response times follow a somewhat exponential pattern, with a lot of quick queries and a long tail of more monstrous queries.

We do have a large number of very complex queries coming in. What we deem to be towards the "simple" end could easily have hundreds of terms, plus wildcard, near and phrase operators. "Difficult" queries are things with hundreds of thousands of terms, or many wildcards within nears/phrases that expand to millions of term combinations.
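To give a feel for how wildcards inside a phrase/near blow up, here's a rough sketch. It assumes each wildcard expands independently to some number of index terms, so the combination count is just the product of the per-position counts. The function name and the numbers are made up for illustration; they aren't real figures from our index.

```python
from math import prod

def expanded_combinations(expansions_per_slot):
    """Count the concrete term sequences a phrase/near query expands to.

    expansions_per_slot holds, for each position in the phrase, how many
    index terms that position matches (1 for a literal term).
    """
    return prod(expansions_per_slot)

# Hypothetical phrase with three wildcards matching 200, 150 and 80 terms,
# and one literal term in the second position.
slots = [200, 1, 150, 80]
print(expanded_combinations(slots))  # 2400000
```

So just three moderately broad wildcards in one phrase is already in the millions.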

For Meltwater's customers it's usually important to get both very high recall and very high precision in the dataset described by a query. It's much less about getting a ranked result list and finding a single hit (like what Google does), and much more about running analytics/dashboards/reports/trends over the dataset delineated by the query, so exactness in the analytics matters a lot to them.
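For anyone not used to the terms, this is the standard definition of precision and recall over the set of documents a query matches. It's a minimal sketch with hypothetical document ids, not how we actually measure it internally.

```python
def precision_recall(matched, relevant):
    """Return (precision, recall) for a query's matched document ids.

    matched:  ids the query returned
    relevant: ids the customer actually cares about
    """
    matched, relevant = set(matched), set(relevant)
    true_positives = len(matched & relevant)
    precision = true_positives / len(matched) if matched else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ids: the query matched 4 docs, 3 of which are relevant,
# out of 5 relevant docs in total.
print(precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6}))  # (0.75, 0.6)
```

Low precision puts noise into the dashboards, and low recall undercounts the trend, which is why both matter here.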

This all makes for complicated queries, since they need both high precision and high recall for whatever a customer is interested in analyzing. Our sales and support organizations help customers write good queries, and we also use AI systems to generate queries.