Hacker News new | ask | show | jobs
by bilbo0s 4515 days ago
This. A THOUSAND TIMES "This".

The one drawback ES had in the bad old days was that backup and restore was a nightmare... ESPECIALLY on AWS. The new system they introduced was so simple I was concerned about updating to it because I was SURE something would go south.

But it all just worked.

I still have the Couch to ES replication running because I'm anal like that... but really... yeah... you can do without Couchbase, Mongo et al... ES will probably do everything you need PLUS everything you can't do in the others.

3 comments

As a proud user of Elastic search since the early days I'm happy to see so much progress. Never mind about the *search part of their naming it's really a database for all practical purposes, especially for web data.
to be fair, the main selling point of mongodb is that developers can access it more easily. i haven't really touched mongodb in over a year and then only for playing, but have you tried the elasticsearch filter query syntax? have you compared mongodbs syntax?

also, i have the exact opposite nitpick. people want to use it to do everything, mail indexers, file system indexers. what's the matter with web developer folks? why is it that when the next database comes around they want to use it for everything?

"....why is it that when the next database comes around they want to use it for everything?...."

Because they like a simple web stack. KISS means a faster time to market. Faster time to iterate. Faster time to fix bugs because there are fewer places those bugs can be. All of that doesn't even factor in the productivity benefits gained by not having to switch technologies from project to project.

But to be fair, ES is not some brand new database... ES has been around for a LONG time.

Apache Lucene has been around awhile. ES has been around since 2010.
Yeah...

that's a pretty long time.

Just curious, if I'm using say version 0.92, how would I go about backing up my ElasticSearch instance. Besides creating a replica in a server, then "freezing" it by disconnecting the server?
The pre-Snapshot/restore method is:

- Pause indexing

- Issue a flush request

- Rsync data directories somewhere

- Resume indexing

This is technically a very naive approach, since a simple rsync of the data dirs will include replicas too. If you were more diligent you could check the state files in each shard directory and only copy out the primaries.

Polyfractal is right.

You can just google "elasticsearch rsync" to get information, and even scripts, that will do this for you. The thing is... you REALLY need to know what you're doing when you go this route.

Also, you can try the gateway feature. Gateway is actually pretty straightforward. Restore WILL be slow though. And for many scenarios ... it is not ideal. (You don't want to take a day, or even a few, to restore after a failure.)

I think the best advice is...

Update to 1.0.

Just go to 1.0 and do snapshots... you will save yourself A LOT of headaches.