Hacker News (HN Mirror)

	by alostpuppy 1288 days ago
	Any graph dbs that do scale?

6 comments

jandrewrogers 1288 days ago

Not really, but that depends on your definition of “scale”. To make one that scales well, you’ll need to solve a few difficult computer science problems that conventional database architectures don’t need to consider and therefore don’t address. These are problems like graph cutting, multi-attribute search without secondary indexes, cache-less I/O schedulers, and a couple others. Scalable solutions to all of these problems exist independently but I’ve never seen any graph database implementations that even attempt to address most of these, never mind all of them, and you kind of need to remove these bottlenecks.

As long as most graph databases are just a layer of graph syntactic sugar sprinkled on top of a conventional database architecture, they won’t scale.

HyperSane 1288 days ago

Horizontally scaling graph DBs is incredibly hard.

holler 1288 days ago

What's your thought on AWS Neptune?

From the marketing page below:

"Scale your graphs with unlimited vertices and edges, and more than 100,000 queries per second for the most demanding applications. Storage scaling of up to 128 TiB per cluster and read scaling with up to 15 replicas per cluster."

https://aws.amazon.com/neptune/

HyperSane 1288 days ago

That isn't real horizontal scaling; they are just running a single vertically scaled writer with read-only replicas, just like they do with RDS. It is probably using the same infrastructure.
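
To make the single-writer point concrete, here is a toy model in Python. It is only a sketch under assumptions: the class name, node names, and round-robin routing are all hypothetical, and it says nothing about how Neptune or RDS are actually implemented. It just shows that adding replicas spreads reads while every write still lands on one node.

```python
from itertools import cycle


class SingleWriterCluster:
    """Toy model (hypothetical): one writer node plus N read-only replicas."""

    def __init__(self, replicas: int) -> None:
        if replicas < 1:
            raise ValueError("need at least one replica")
        self.writes_per_node = {"writer": 0}
        self.reads_per_node = {f"replica-{i}": 0 for i in range(replicas)}
        # Round-robin over replica names; the routing policy is an assumption.
        self._next_replica = cycle(list(self.reads_per_node))

    def write(self) -> str:
        # Every write goes to the single writer, however many replicas exist.
        self.writes_per_node["writer"] += 1
        return "writer"

    def read(self) -> str:
        # Reads are spread evenly across the read-only replicas.
        node = next(self._next_replica)
        self.reads_per_node[node] += 1
        return node


cluster = SingleWriterCluster(replicas=15)
for _ in range(300):
    cluster.write()
    cluster.read()

print("writes on the writer:", cluster.writes_per_node["writer"])
print("max reads on any replica:", max(cluster.reads_per_node.values()))
```

With 15 replicas the read load per node drops to a fifteenth, but the writer still absorbs all 300 writes, which is why this is read scaling rather than horizontal write scaling.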

rrwright 1288 days ago

1,000,000 events per second with https://Quine.io

https://www.thatdot.com/blog/scaling-quine-streaming-graph-t...

Full disclosure: I work on this project.

jandrewrogers 1288 days ago

When I first saw the benchmark result, I was pleasantly surprised by the performance. You rarely see that on a single large server, but it is achievable if the implementation properly does all the hard bits.

Then I saw that it required 140(!) servers to achieve that result and now I’m wondering what all that hardware is actually doing. On a per-server basis, that is very low throughput, even for graphs. Efficiency that low will make it uneconomical for most graph applications.
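
For reference, the per-server figure works out as below. This is a back-of-the-envelope sketch using only the two numbers quoted in this thread (1,000,000 events per second, 140 servers); the function name is made up for illustration, and it assumes load is spread evenly across servers.

```python
def per_server_throughput(total_events_per_sec: float, servers: int) -> float:
    """Average events per second per server, assuming an even spread."""
    if servers <= 0:
        raise ValueError("servers must be positive")
    return total_events_per_sec / servers


# Figures quoted in this thread: 1,000,000 events/s across 140 servers.
rate = per_server_throughput(1_000_000, 140)
print(f"{rate:,.0f} events/s per server")  # roughly 7,143
```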

rrwright 1288 days ago

That’s not an optimized benchmark, just a demonstration using a real customer workload. Throughput depends on the workload but goes up to about 20,000 events per second per machine. All this is while simultaneously querying the graph and streaming out 20,000+ events per second. All that includes durable storage.

Price it out against Neptune instead and Quine is much less than 1% of the cost.

lolive 1288 days ago

On a previous project, we hit the limit of 32 billion unique IDs (Neo 3.2, if I remember correctly) and had to wait for the next version before we could add more data.

I left the project. But, as far as I know, there are still planes maintained and authorized to fly in the sky, so I suppose the DB is still up in production.

lolive 1288 days ago

And I am pretty sure the number of IDs in the DB has skyrocketed since that time.

logicalmonster 1287 days ago

JanusGraph may or may not fit your needs, though there's quite a learning curve even figuring out how to set it up.

kleinsch 1288 days ago

Tao
