Hacker News
Truth Behind Neo4j’s “Trillion” Relationship Graph (tigergraph.com)
61 points by kakakiki 1722 days ago
4 comments

Can't imagine the insane amount of memory you'd need for that. Tried Neo4j a few years ago, loaded up a small dataset that fit in a 100 MB Postgres instance, and had to stop halfway through when Neo4j was consuming tens of gigabytes of memory.
PostgreSQL itself makes for a perfectly fine graph database for general-purpose use cases (the ones Dgraph and Neo4j also focus on, not heavy-duty network analysis!). It's only the query language that's a bit unintuitive (though PostgreSQL 14 added the standard SEARCH and CYCLE clauses for recursive CTEs, which help a little). But performance is competitive.
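For anyone who hasn't seen graph traversal done with a recursive CTE, here is a rough sketch of the idea. Assumptions: SQLite (via Python's built-in `sqlite3`) stands in for PostgreSQL so the snippet runs without a server, and the `edges` table, the `walk` CTE and the `reachable` helper are made-up names for illustration. SQLite has no CYCLE clause, so a depth bound guards against looping here; on PostgreSQL 14+ the CYCLE clause mentioned above is meant to handle that part.

```python
import sqlite3


def reachable(conn, start, max_depth):
    """Return {node: fewest hops from start} for nodes within max_depth hops."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0                      -- anchor row: the start node
            UNION                            -- UNION drops duplicate (node, depth) rows
            SELECT e.dst, w.depth + 1
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.depth < ?                -- depth bound stops cycles from looping
        )
        SELECT node, MIN(depth) FROM walk GROUP BY node ORDER BY node
        """,
        (start, max_depth),
    ).fetchall()
    return dict(rows)


# Hypothetical toy graph stored as a plain edge list; a -> b -> c -> a is a cycle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e")],
)

result = reachable(conn, "a", 3)
print(result)
```

The graph is just a two-column relational table, and the "3 layers deep" traversal is a single query. Whether this stays fast on a large data set depends on indexing `edges(src)` and on how quickly the frontier grows, which this toy example says nothing about.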
We had the same experience. Ended up converting the entire database in Neo4j to relational tables in postgres.
Yup, same here. If you have any moderately large data set and do manage to get it loaded, it will break on even a light graph traversal 3 layers deep. Played with this for a couple of weeks but eventually realized it is not for large data sets.
This article really rubbed me the wrong way since it was critical of a major competitor to TigerGraph. I would like to see a response from Neo4j.

Neither product is open source. There are a few open source graph databases that scale well.

No way, more companies need to do this. Yeah, it's kinda rude in the industry, but comparisons like this are vital for the customer. Way too few products actually get this type of comparison.
I totally agree. I actually love this. I always assume that each company exaggerates their product's capabilities.

In highly technical areas, the number of people who can push back on marketing BS is likely to be very small and there is a good chance they are working for the competition. If companies challenge each other and then defend themselves against challenges, it gives me highly valuable information to figure out who is full of it and who is not. It also tends to draw in attention of other industry people who can weigh in, adding even more to the discussion.

Yep exactly, the adversarial system is how we do things in court, because it's the best method for truth discovery.
More transparency and public debate about the weaknesses and limitations of graph databases is helpful regardless of the source, as there is a tendency to paper over the realities in the marketing materials. The same lens could be pointed at TigerGraph or any other graph database.

All current graph databases, open or closed source, have serious deficiencies at scale. Different implementations hide these problems in different places but they all have them, and they manifest early and often. Selecting a graph database is an exercise in deciding which deficiencies you are willing to live with.

We could probably do a lot better but it has always been a bit too niche to attract the right people. (There are hardcore DS&A and database theory problems central to making graph databases work well that are largely ignored in conventional database engine designs, but most graph databases tend to be designed by people that love graphs rather than people with deep expertise in those computer science problems. Would be an interesting problem to work on.)

EDIT: I find the article to be a very reasonable and thorough explanation of why the benchmark is at best misleading and at worst deceptive.

Neo4j is open core, isn't it?

Then there are OSS graph DBs like JanusGraph, Titan, and TinkerPop (TinkerPop just for graph-like interfaces to OLTP/OLAP stores), right?

Should they have said nothing? They qualify their argument pretty well. If Neo4j wants to respond they can do so.
I'm not sure the "open source" qualifier is needed there.
I'd love a more lightweight replacement for Neo4j, but I need something that has good Cypher support. Options are pretty limited.
RedisGraph may be an option for you.
I think PostgreSQL will at some point support Cypher.
There is a projected "Property Graph Query" extension of SQL, and the PostgreSQL folks have expressed some interest in it. Not sure how it will relate to Cypher, but the feel will probably be similar.
Depending on how one views "postgres", there are at least two extensions that allegedly do it: https://age.apache.org/ and AgensGraph, from which AGE derives.
I think eventually all infrastructure software tends to be written in a language like C, C++, or Rust, where you have fine-grained, low-level control of memory and performance. For widely used infrastructure, the optimization time is justified by how much it will be used.
I'm missing the connection between this comment and the blog post.
I assume the connection is that Neo4j is written in Java, not in any of those.
An example of that would be Scylla, which is compatible with Cassandra but more performant and written in C++.