HN Mirror


by alexchantavy 1255 days ago

Thanks for digging and sharing, I enjoyed your snark.

> They decided to provide the data not in a CSV file like a normal human being would, but instead in a giant cypher file performing individual transactions for each node and each relationship created. Not batches of transactions… but rather painful, individual, one at a time transactions one point 8 million times. So instead of the import taking 2 minutes, it takes hours.

Yeahhh I noticed this too when I looked at the repo when their blog was posted a couple weeks back. Running a transaction for each object will of course be very slow and real production code will (hopefully) not do this.
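For what it's worth, here is a minimal sketch of what batching could look like, assuming the rows are already available as plain dicts. The `User` label, the `build_batches` helper, and the batch size of 10,000 are all made up for illustration and are not taken from the mgBench repo. The idea is just to send many rows per transaction through a single `UNWIND` statement instead of one statement per node.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical label, for illustration only. UNWIND expands the list
# parameter so one statement creates a whole batch of nodes.
BATCH_QUERY = "UNWIND $rows AS row CREATE (n:User) SET n = row"


def batched(rows: Iterable[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive chunks of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_batches(
    rows: Iterable[Dict[str, Any]], size: int = 10_000
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Pair the batch query with one parameter map per chunk of rows."""
    for chunk in batched(rows, size):
        yield BATCH_QUERY, {"rows": chunk}


if __name__ == "__main__":
    users = ({"id": i} for i in range(25_000))
    statements = list(build_batches(users, size=10_000))
    # 25,000 rows in chunks of 10,000 gives 3 statements instead of 25,000.
    print(len(statements))
```

Each `(query, params)` pair would then be run in its own transaction by whatever driver you use, so 1.8 million objects become a couple hundred round trips rather than 1.8 million. The right batch size is a guess here and would need tuning against the actual database.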

> Those are not “graphy” queries at all, why are they in a graph database benchmark? Ok, whatever.

I’m definitely interested in seeing more realistic scenarios with actual “graphy” queries and batched transactions comparing the two. Oh, and comparing against Neptune would be cool too, since that supposedly supports openCypher now (which I hear is kinda close to Neo4j's Cypher?).

1 comment

mapleeman 1255 days ago

This is true regarding the transactions and cypherl. All of the data is loaded as cypherl transactions because Memgraph can handle a large volume of transactions. mgBench was designed to run in our in-house CI/CD and is still tightly coupled with Memgraph, which is why we are still running everything as transactions. We did open an issue where we plan to improve things, with CSV support for faster imports being one of them: https://github.com/memgraph/memgraph/issues/689 Feel free to suggest things; we will add some of the things Max suggested. Agreed on the more complex queries and on comparing different vendors.