by erikwitt
3424 days ago
That is absolutely right. You can easily write queries that can never be executed efficiently, even with great indexing, especially in MongoDB if you think about what people can do with the $where operator. In retrospect, what would be your preferred approach to prevent users from executing inefficient queries? We are currently investigating whether deep reinforcement learning is a good approach for detecting slow queries and making them more efficient by trying different combinations of indices.
I did some deep dives into the performance of large customers near the end of my tenure at Parse to help with some case studies. Frankly, it took the full power of Facebook's observability tools (Scuba) to catch some of the big issues. My top two lessons were:
1. Fix a bug in our indexer for queries like {a: X, b: {$in: Y}}. The naive assumption is that you can put a or b first in a compound index and it makes no difference. In fact, putting a before b gave a 40x boost in read performance due to locality.
2. The Mongo query engine uses probers to pick the best index per query. If the same query is used across different populations, the selected index would bounce, and each population would get preferred treatment for the next several thousand queries. If data analysis shows you have multiple populations, you can add fake terms to your query to split the index strategy.
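To illustrate the locality point in lesson 1, here is a toy sketch in Python. It is not MongoDB's actual B-tree and it does not reproduce the 40x figure; it only models a compound index as a sorted list of key tuples and measures how far apart the matching entries sit for a query like {a: X, b: {$in: Y}}. The helper names (build_index, matched_positions, span) and the synthetic data are assumptions made up for this example.

```python
from bisect import bisect_left


def build_index(docs, fields):
    """Model a compound index as a sorted list of key tuples (toy model, not a real B-tree)."""
    return sorted(tuple(doc[f] for f in fields) for doc in docs)


def matched_positions(index, fields, a_value, b_values):
    """Return index positions matching a == a_value and b in b_values."""
    positions = []
    for b in b_values:
        # Build the lookup key in the same field order as the index.
        key = tuple(a_value if f == "a" else b for f in fields)
        i = bisect_left(index, key)
        while i < len(index) and index[i] == key:
            positions.append(i)
            i += 1
    return positions


def span(positions):
    """Distance between the first and last matching entry, inclusive."""
    return max(positions) - min(positions) + 1 if positions else 0


# Synthetic data: one document per (a, b) pair, 100 values for each field.
docs = [{"a": a, "b": b} for a in range(100) for b in range(100)]

index_ab = build_index(docs, ("a", "b"))  # a first, then b
index_ba = build_index(docs, ("b", "a"))  # b first, then a

query_a = 42
query_bs = [3, 50, 97]  # stands in for the $in list

pos_ab = matched_positions(index_ab, ("a", "b"), query_a, query_bs)
pos_ba = matched_positions(index_ba, ("b", "a"), query_a, query_bs)

span_ab = span(pos_ab)
span_ba = span(pos_ba)

print("matches:", len(pos_ab), len(pos_ba))  # matches: 3 3
print("span with (a, b):", span_ab)          # span with (a, b): 95
print("span with (b, a):", span_ba)          # span with (b, a): 9401
```

Both orderings find the same three entries, but with a first they all fall inside the one contiguous a = 42 block, while with b first each $in value lands in a different region of the index. In a real index that difference would show up as more pages touched per query, which is one plausible reading of the locality effect described above.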