by rickmode 2533 days ago
When using ES for indexing and not as the primary store, you can (and should) periodically fully reindex the data set. You can use a blue/green pattern: create a new index, then swap from the old one to the new one. ES supports aliases, which makes the swap transparent to the apps using the index. Now you have more options.
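For anyone who hasn't done the alias swap before, here is a rough sketch of the request body for ES's `_aliases` endpoint. The index naming scheme and the `users` alias are made up for illustration, and the helper functions are hypothetical, not part of any client library. It only builds the JSON; sending it (and creating and populating the new index first) is left to whatever client you use.

```python
import json
from datetime import date


def versioned_index_name(alias: str, day: date) -> str:
    """Hypothetical naming scheme: one physical index per rebuild."""
    return f"{alias}-{day:%Y.%m.%d}"


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST /_aliases that repoints `alias`.

    Both actions travel in a single request, so clients querying the
    alias switch from the old index to the new one in one step.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Assumed example: yesterday's index is live, today's was just rebuilt
    # from the primary database.
    old = versioned_index_name("users", date(2024, 1, 30))
    new = versioned_index_name("users", date(2024, 1, 31))
    print(json.dumps(alias_swap_actions("users", old, new), indent=2))
```

Apps keep querying `users` the whole time; only the alias target changes. The old physical index can be dropped once the swap has gone through.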

If it is easy to delete specific users from the primary database, the deleted users will naturally disappear during the next ES reindex.

Edit: The old index is deleted at the file system level.

If the reindexing occurs daily or weekly, perhaps this will satisfy GDPR.

There are other good reasons not to use ES as the primary data store. First, it isn’t entirely reliable. It’s good and I’ve never seen corruption, but ES and Lucene don’t have a history as a reliable database. Second, if you want to change how you index, it is a bit easier to do if the source data lives outside of ES.

1 comment

Thanks, I wasn't arguing that using ES as primary was good. I just don't necessarily see the GDPR argument as a reasonable one. That said, I've seen some startups using Mongo as primary, and I have to wonder whether using ES would be that big a difference at that point (not a Mongo dig, as I've kept away from it for various reasons).