| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by EGreg 4214 days ago
	The main reason is speed. Graph traversal is constant time as opposed to O(log n) for each traversal. The speed only becomes relevant later, and you can therefore add graph databases as a slightly delayed replica on top of a db for things you need to traverse fast.

2 comments

jamesfisher 4214 days ago

This is true, however I think this again conflates two things: the relational model and the "tabular" data layout. I gather that databases like Neo4j obtain constant time traversal by storing each node's list of "relationships" at the same location as the node itself. We might be able to generalize this to a relational database, where every row in the database is stored together with the list of addresses of all other rows which reference it via a foreign key.

link

zkhalique 4214 days ago

How? Describe a detailed solution.

link

anonetal 4214 days ago

PostgreSQL has an "array" data type (http://www.postgresql.org/docs/9.4/static/arrays.html), as do other databases. You could use that for that purpose (I know of research prototypes that have attempted this). It is unlikely to work well in general, especially given that degree distributions are often skewed (i.e., there is very high spread in the number of relationships per node) -- if I had the time, I would do a proper empirical comparison.

link

SigmundA 4214 days ago

Couldn't this be improved with new data types and types of indexes in a relational db, similar to spatial types/indexes?

link