HN Mirror


by atombender 786 days ago

The "Merkle tree" algorithm here isn't using a Merkle tree, it's just a binary partitioning algorithm. The point of a Merkle tree is that it's a tree of hashes. Also, it doesn't really solve the consistency problem the author claims is the biggest problem; yes, over time it will correct for Elasticsearch's eventual consistency, but in the short run it's just as bad as pagination.
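To make the distinction concrete, here is a minimal sketch of what "a tree of hashes" means: each parent is the hash of its two children, so two datasets can be compared by their roots alone. The function name `merkle_root`, the choice of SHA-256, and the convention of duplicating the last node on odd-sized levels are all assumptions for illustration, not anything taken from the article.

```python
import hashlib


def merkle_root(leaves: list[bytes]) -> bytes:
    """Return the root hash of a binary hash tree built over `leaves`."""
    if not leaves:
        # Assumed convention: the empty tree hashes the empty string.
        return hashlib.sha256(b"").digest()
    # Leaf level: hash every item individually.
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            # Assumed convention: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Changing any single leaf changes the root, which is the property that plain range partitioning does not give you.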

I don't know the author's application, but I question the desire to get a consistent dump from Elasticsearch in the first place. It is very much not intended to be a "source of truth", so you're better off streaming the data from your original data source, which is presumably something like an SQL database.

That said, if you want a stable snapshot of an entire index — where your requirement is to not ever miss documents due to concurrent updates — then you can use Elasticsearch's snapshot support. Each snapshot is just that, a read-only snapshot of the data, allowing consistent reads.

The eventual consistency problem that the article describes is solved by refreshing the index. You can use "refresh=wait_for" when doing an update in order to wait for Elasticsearch to make the update searchable. You can also force a refresh. Any subsequent query will return the newest indexed data.
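As a rough sketch of the two options, the snippet below only builds the HTTP requests rather than sending them. The base URL `http://localhost:9200`, the helper names, and the index and document values are assumptions; `refresh=wait_for` and the `_refresh` endpoint are the Elasticsearch features referred to above.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: a local, unauthenticated node


def index_and_wait(index: str, doc_id: str, doc: dict) -> Request:
    """Build an index request that returns only once the document is searchable."""
    return Request(
        f"{ES_URL}/{index}/_doc/{doc_id}?refresh=wait_for",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def force_refresh(index: str) -> Request:
    """Build a request that forces an immediate refresh of the index."""
    return Request(f"{ES_URL}/{index}/_refresh", method="POST")
```

Either request can be sent with `urllib.request.urlopen(req)`. Forcing a refresh on every write is usually more expensive than `wait_for`, so it is better suited to a one-off call before a dump.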

Since 6.x, Elasticsearch has had docvalue pagination via "search_after", which allows pagination without a durable cursor. Each cursor value is the docvalue set of the last seen document. This is consistent insofar as the set of source documents is consistent (so it's not safe against concurrent updates). There's essentially little need to use "_scroll" or offset-based pagination anymore.
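A minimal sketch of that loop, assuming a caller-supplied `search` function that takes a request body and returns the parsed JSON response. The field names `updated_at` and `id` are hypothetical; what matters is that the sort ends in a unique tiebreaker so the order is total.

```python
from typing import Callable, Iterator


def iterate_all(search: Callable[[dict], dict], page_size: int = 1000) -> Iterator[dict]:
    """Yield every hit, paging with search_after instead of offsets or _scroll."""
    body = {
        "size": page_size,
        "query": {"match_all": {}},
        # Hypothetical fields: "id" is assumed unique and acts as the tiebreaker.
        "sort": [{"updated_at": "asc"}, {"id": "asc"}],
    }
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The "cursor" is just the sort values of the last document seen.
        body["search_after"] = hits[-1]["sort"]
```

No state lives on the server between pages, which is why a document updated mid-iteration can still move past or behind the cursor.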

2 comments

dan-robertson 786 days ago

I think the author was using the search_after strategy (or something roughly equivalent). They were using ‘cursor’ in the generic sense, as a place in the list of documents, rather than the specific durable cursors API offered by Elasticsearch.


ramsicandra 784 days ago

I think this is a fair criticism. The Merkle tree isn't actually used here. I was just inspired by the diagram and came up with a binary partitioning solution.

In terms of performance, it's fair to say that this binary partitioning algorithm is slightly worse than cursor / search_after pagination, since it has the overhead of checking counts, which cursor pagination does not need.
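To illustrate where that overhead comes from, here is a generic sketch of binary partitioning over an integer key range. It is not the code from the article: `partition_ranges`, the `count` callable, and the half-open `[lo, hi)` ranges are all assumptions. Every split costs one count query before any documents are fetched.

```python
from typing import Callable


def partition_ranges(
    count: Callable[[int, int], int], lo: int, hi: int, max_docs: int
) -> list[tuple[int, int]]:
    """Split [lo, hi) until each non-empty range holds at most max_docs documents."""
    n = count(lo, hi)  # one extra round trip per visited range
    if n == 0:
        return []
    if n <= max_docs or hi - lo <= 1:
        # Small enough to fetch in one request, or the range cannot be split further.
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return partition_ranges(count, lo, mid, max_docs) + partition_ranges(
        count, mid, hi, max_docs
    )
```

With search_after, the same traversal needs only the page requests themselves.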

Hmm, it never crossed my mind to change or correct the design of using ES as a primary data source. My guess now is that migrating ES -> SQL would take as much effort as ES -> ES, or more.

I think the snapshot approach is interesting. If I had to start over, I'd most likely explore that.
