by buckhx 2758 days ago
ES is currently the main data store for zagat.com, where it ends up being the sink of a data pipeline and is more or less used as a key-value store on the query side. It has worked OK for our current use case, but definitely came with some pain points. Primary key fetches were way too slow for what we needed, especially with some in-memory joins happening, and we ended up sticking a cache in front of ES to satisfy our performance reqs.
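
For anyone wondering what "a cache in front of ES" might look like, here's a rough sketch of a read-through cache for primary key fetches. It's an illustration under assumptions, not our actual code: `ReadThroughCache`, the `fetch` callable (standing in for whatever get-by-ID call hits ES), and the 60 second TTL are all made up for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ReadThroughCache:
    """In-process read-through cache around a slow primary-key lookup."""

    def __init__(self, fetch: Callable[[str], Optional[dict]], ttl_seconds: float = 60.0):
        # `fetch` is a hypothetical stand-in for a get-by-ID call against ES.
        self._fetch = fetch
        self._ttl = ttl_seconds
        # key -> (expiry timestamp, cached document or None)
        self._entries: Dict[str, Tuple[float, Optional[dict]]] = {}

    def get(self, key: str) -> Optional[dict]:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve from memory and skip the backing store.
            return entry[1]
        # Miss or expired: go to the backing store and remember the result.
        # Missing documents (None) are cached too, to avoid repeated lookups.
        doc = self._fetch(key)
        self._entries[key] = (now + self._ttl, doc)
        return doc
```

You'd wrap the slow lookup once, e.g. `cache = ReadThroughCache(fetch_from_es)` where `fetch_from_es` is your own function, and call `cache.get(doc_id)` everywhere on the query side. The TTL is the knob that trades freshness against load on ES. A real setup would likely also want a size bound and probably a shared cache instead of a per-process dict.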

We had a tight deadline on implementation (3 months to extract from Google) and chose ES because it could cover both the KV store and TF-IDF corpus search.

1 comment

If it's a sink from a data pipeline, then presumably it's not your primary point-of-truth data store, because if Elasticsearch gets corrupted you can rebuild it from the rest of the pipeline?
Yep, it's a bit more symbiotic than that unfortunately, but in general most of the data can be restored from an upstream source.