Hacker News new | ask | show | jobs
by EGreg 4167 days ago
The main reason is speed. Graph traversal is constant time as opposed to O(log n) for each traversal. The speed only becomes relevant later, and you can therefore add graph databases as a slightly delayed replica on top of a db for things you need to traverse fast.
2 comments

This is true, however I think this again conflates two things: the relational model and the "tabular" data layout. I gather that databases like Neo4j obtain constant time traversal by storing each node's list of "relationships" at the same location as the node itself. We might be able to generalize this to a relational database, where every row in the database is stored together with the list of addresses of all other rows which reference it via a foreign key.
How? Describe a detailed solution.
PostgreSQL has an "array" data type (http://www.postgresql.org/docs/9.4/static/arrays.html), as do other databases. You could use that for that purpose (I know of research prototypes that have attempted this). It is unlikely to work well in general, especially given that degree distributions are often skewed (i.e., there is very high spread in the number of relationships per node) -- if I had the time, I would do a proper empirical comparison.
Couldn't this be improved with new data types and types of indexes in a relational db, similar to spatial types/indexes?