
by neunhoef 4066 days ago

(Disclaimer: Max from ArangoDB here)

The purpose of this benchmark series was not to provide a comprehensive test of all these databases. We only wanted to demonstrate that a multi-model database can successfully compete with specialised solutions like document stores and specialised graph databases.

I agree with your comment about graph databases. The crucial thing is that the neighbors of a vertex can be found in time proportional to their number, and that queries involving an a priori unknown number of steps (graph traversals, path matching, shortest path, etc.) run efficiently in the database server and can be accessed conveniently from the query language.
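To make the two properties concrete, here is a minimal in-memory sketch, not how ArangoDB or any other database implements them. The class `AdjacencyGraph` and its method names are made up for illustration. Each vertex keeps its own list of outgoing edges, so listing its neighbors costs time proportional to their number, and the shortest-path search takes as many steps as the data requires rather than a number fixed in advance.

```python
from collections import defaultdict, deque


class AdjacencyGraph:
    """Hypothetical directed graph backed by per-vertex adjacency lists."""

    def __init__(self):
        # vertex -> list of vertices reachable over one outgoing edge
        self._out = defaultdict(list)

    def add_edge(self, src, dst):
        self._out[src].append(dst)

    def neighbors(self, vertex):
        # One dictionary lookup, then a list whose length is the
        # out-degree: the cost does not depend on the total graph size.
        return self._out.get(vertex, [])

    def shortest_path(self, start, goal):
        """Breadth-first search; returns a list of vertices or None."""
        if start == goal:
            return [start]
        parent = {start: None}
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            for nxt in self.neighbors(vertex):
                if nxt in parent:
                    continue  # already reached over a path at least as short
                parent[nxt] = vertex
                if nxt == goal:
                    # Walk the parent links back to the start vertex.
                    path = [nxt]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(nxt)
        return None


graph = AdjacencyGraph()
for src, dst in [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e"), ("e", "c")]:
    graph.add_edge(src, dst)
```

With these edges, `graph.neighbors("a")` touches only the two outgoing edges of `a`, and `graph.shortest_path("a", "c")` finds the two-step route through `b` without the caller stating the depth up front.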

1 comments

ThePhysicist 4066 days ago

Understood. I did not mean to be overly critical; I just think there is a lot of misleading information out there about graph databases, and for many people it is hard to find reliable information about their real benefits and drawbacks.

ArangoDB seems to be a very interesting project, by the way. I might evaluate it again for my project in the future (we are currently building a very large graph of code data, so we need something that can scale beyond 1B nodes and 100B edges).

neunhoef 4066 days ago

No worries. Yes, there is a lot of bad information about graph databases. To begin with, some seem to believe that everything out there can best be described by a graph, which is clearly wrong. I have written something about this myself in this article: https://medium.com/@neunhoef/graphs-in-data-modeling-is-the-...

Furthermore, I am currently working on another article for the O'Reilly radar blog presenting a nice case study in which a multi-model database was very useful, because document queries and graph queries were both used extensively.

1B vertices and 100B edges will definitely be a challenge for any graph database, and I find it highly likely that ArangoDB in its current version will not perform very well on a data set of this size. Of course, this will always depend on the particular queries you need, and on whether the graph has a natural cluster structure that can be used for sharding.
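To illustrate why cluster structure matters for sharding, here is a toy sketch. The graph, the two assignment functions, and the helper `cross_shard_edges` are all invented for this example and say nothing about how ArangoDB actually distributes data. The idea is only that every edge whose endpoints live on different shards forces a traversal to hop between servers, so an assignment that follows the clusters keeps most hops local.

```python
def cross_shard_edges(edges, shard_of):
    """Count edges whose two endpoints are placed on different shards."""
    return sum(1 for src, dst in edges if shard_of(src) != shard_of(dst))


# Two triangles (vertices 0-2 and 3-5) joined by a single bridge edge.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]


def shard_by_cluster(vertex):
    # Assumes the clusters are known in advance: one shard per triangle.
    return 0 if vertex < 3 else 1


def shard_by_modulo(vertex):
    # Ignores structure entirely and alternates vertices between shards.
    return vertex % 2


cut_clustered = cross_shard_edges(edges, shard_by_cluster)
cut_modulo = cross_shard_edges(edges, shard_by_modulo)
```

On this toy graph the cluster-aware assignment cuts only the bridge edge, while the structure-blind assignment cuts five of the seven edges. Real graphs rarely split this cleanly, which is why the answer depends on the data and the queries.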