| HN Mirror

We're using ElasticSearch for events, too. The aggregation operators are really surprisingly fast.

That said, one major downside to ES is that it's not schemaless. You can try to use the dynamic mapping system, but it will most likely just bite you eventually, since ES is strict about coercing data types. If your data isn't completely consistent, it will actually refuse to index it. Any changes made to your schema also requires reindexing. (For some reason ES can't do in-place indexing, despite supporting storing all the original data in the "_source" field.)

If your data isn't perfectly consistent, one way to work around the mapping problem is to append a type name to every field. So instead of indexing {"user_id": "3"}, you index {"user_id.string": "3"}. This means that if you get some input data where the user_id is an int, it doesn't conflict because it will stored in "user_id.int". You have to handle the inconsistency on the query end, but it's possibly better than micromanaging the index.