Hacker News new | ask | show | jobs
by aPoCoMiLogin 33 days ago
to be honest, the way compound index is implemented in mongo is problematic (index intersection never worked for me). you have to tinker a lot sometimes, so the query params are in the right order so the query planner don't decide to pick other index or not use index at all.

not sure if this was fixed in recent versions, but if you have index on array of strings/ints, and then use `addToSet` operator, the index is not used for the `addToSet` operation, and the disk is hammered for each update to verify if the content of the array has unique values. with pretty high volume, you can easily saturate the disk queue, wondering what is happening and how you can fix that

after few years of working with few TB collections (tables) in mongo, I can say that this DB is good if you want to insert data, but don't need to read the data that much often. if you need json and read heavy DB, consider something else eg elastic (with all it's drawbacks), because at some point the limiting factor for you will be indexes in mongo, that are the most weak part of this DB (even more weak than the infamous data loss)

1 comments

That's a valid concern. I completely understand that compound indexes should not be applied blindly, and the database should select an appropriate execution plan on its own.

That's also why I decided to concentrate on the entire workflow rather than on the index itself in the article: Find the slow query -> Analyze Explain -> Analyze the ratio between scanned and returned documents -> Create an index -> Test the same query again...

In this particular case, a compound index was selected based on filters and a sort clause. I completely agree that using index intersections in a real workload could be quite dangerous, especially if MongoDB chooses to use another index instead.

The `$addToSet` example is also very valid; however, for me, that is more like a schema problem than a regular indexing problem in a read query. Still, the array could become extremely big regardless of the presence of indexes, and it might require a dedicated note in the article.