| HN Mirror


by neunhoef 3752 days ago

One of the developers of ArangoDB here.

Let me explain this quotation. When your graph data (including indices) no longer fits into the RAM of a single server, you can either live with the higher latency of loading data from disk, or you can use sharding, which leads to network communication between servers and therefore slower traversals.

That does not mean that things stop working, but performance will be worse: you can no longer visit tens of millions of nodes per second in a traversal, as you can when everything is in RAM on a single server.

If you actually only traverse a much smaller hot subgraph, I would probably go for the disk-based single-server approach.

If your graph has a natural, known clustering, then an optimized sharding solution with fine-tuned sharding keys is probably your best bet.
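
To make the sharding-key point concrete, here is a toy sketch in plain Python. It does not use ArangoDB's API; the `cluster:vertex` id scheme, the CRC32 hash, and all helper names are assumptions made up for illustration. It counts how many edges end up crossing shards when you shard by vertex id versus by a key that follows the graph's clustering.

```python
import zlib


def shard_of(key: str, num_shards: int) -> int:
    """Map a shard key to a shard with a stable hash (CRC32, chosen for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def cluster_key(vertex_id: str) -> str:
    """Hypothetical id scheme 'cluster:vertex'; the cluster prefix is the shard key."""
    return vertex_id.split(":", 1)[0]


def count_cross_shard_edges(edges, key_fn, num_shards):
    """Count edges whose two endpoints land on different shards."""
    return sum(
        1
        for u, v in edges
        if shard_of(key_fn(u), num_shards) != shard_of(key_fn(v), num_shards)
    )


def build_toy_graph(num_clusters=4, cluster_size=50):
    """Ring inside each cluster, plus one bridge edge between consecutive clusters."""
    intra, bridges = [], []
    for c in range(num_clusters):
        for i in range(cluster_size):
            intra.append((f"c{c}:v{i}", f"c{c}:v{(i + 1) % cluster_size}"))
        bridges.append((f"c{c}:v0", f"c{(c + 1) % num_clusters}:v0"))
    return intra, bridges


intra_edges, bridge_edges = build_toy_graph()
all_edges = intra_edges + bridge_edges
NUM_SHARDS = 3

# Shard key = full vertex id: neighbours are scattered across shards.
by_vertex = count_cross_shard_edges(all_edges, lambda v: v, NUM_SHARDS)
# Shard key = cluster prefix: each cluster stays together on one shard.
by_cluster = count_cross_shard_edges(all_edges, cluster_key, NUM_SHARDS)

print(f"cross-shard edges, sharding by vertex id: {by_vertex} of {len(all_edges)}")
print(f"cross-shard edges, sharding by cluster:   {by_cluster} of {len(all_edges)}")
```

With the cluster prefix as the shard key, edges inside a cluster never cross shards, so only the bridge edges can cost a network hop during a traversal. Real graphs rarely cluster this cleanly, so treat this as an intuition aid, not a benchmark.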

You can do all this with ArangoDB.

However, graph traversals vary greatly in many respects, and your mileage may vary accordingly, with any approach.

I would love to chat in more detail about your use case.