| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by clpm4j 1832 days ago
	Native graph storage and index-free adjacency. No tables, no JOINs.

1 comments

__jem 1832 days ago

Okay,that seems more interesting. Any resources on the data structures used to avoid indices? Without table ddl, if node types are arbitrary, that seems like a hard problem to solve in terms of storage layout.

link

clpm4j 1832 days ago

"To understand why native graph technology is so efficient, we step back in time a little to 2010 and the coining of the term index-free adjacency by Rodriguez and Neubauer. The great thing about index-free adjacency is that your graphs are (mostly) self-indexing. Given a node, the next nodes you may want to visit are implicit based on the relationships connecting it. It’s a sort of local index, which allows us to cheaply traverse the graph (very cheaply, at cost O(1) per hop).

Neo4j manages to keep traversal costs so low (algorithmically and mechanically) by implementing traversals as pointer chasing. This implementation option is available to us precisely because we bear the cost of building the storage engine..."[1]

"Each node (entity or attribute) in the graph database model directly and physically contains a list of relationship records that represent the relationships to other nodes. These relationship records are organized by type and direction and may hold additional attributes. Whenever you run the equivalent of a JOIN operation, the graph database uses this list, directly accessing the connected nodes and eliminating the need for expensive search-and-match computations."[2]

Resources if you're curious: [1] https://neo4j.com/blog/computer-hardware-native-graph-databa... [2] https://neo4j.com/developer/graph-db-vs-rdbms/

link

zozbot234 1832 days ago

> Neo4j manages to keep traversal costs so low (algorithmically and mechanically) by implementing traversals as pointer chasing.

This is how network and navigational databases worked, before modern RDBMS's were introduced. It's very much legacy tech, and skipping the indexing step brings negligible gains if any at all. Where optimizations are worthwhile, they're already being used.

link

AndrewBowman 1832 days ago

If that were so then there would be no need for native graph databases at all, and we would not be seeing cases that could not be served by relational dbs that are possible with Neo4j and native graphs.

You may be thinking of non-graph use cases. When hundreds of thousands to millions or more traversals are required to address graphy use cases, if those traversals are implemented as table joins, and the join complexity is dependent upon tables that are millions or billions in size (so dependence on total size of the data instead of just the relevant connected relationships) then you can see where pointer hopping on only relevant connected relationship and node records (proportional only to the elements of the subgraph traversed, not total data) would outperform the relational model. Also you have the flexibility of being as strict or as lenient as required with the labels of nodes traversed or the relationship types traversed, as well as their direction. That's tougher to do when you may not know what tables are meant to be joined or how, or if you pour all your nodes into a giant table, where the join cost is proportional to your total data.

Relational databases are very good at what they do. But no tool is perfect and covers all use cases easily. Design is a matter of tradeoffs, and some of the design choices made that make them excellent in many categories becomes a weakness in others. We're in an era of big data, huge data, where modeling, traversing, and exploring the connections within this data is increasingly valuable, and increasingly costly due to the sheer amount of data and the complexity of both the connections between and the use cases themselves. Native graph databases are a tool for these cases, and can also bring along simplicity in modeling and querying to the table as well as the performance that gives them an edge in these cases.

link

__jem 1832 days ago

Thanks! Will read through these.

link