Hacker News new | ask | show | jobs
by lmkg 2460 days ago
One of the key features of graph databases is to select one node, and then recursively 'chase' edges until you find another node matching some criteria. Other database models can have trouble representing chasing an unbounded number of edges. E.g. in the relational model, following an edge to another node is usually represented as a Join operation, and SQL doesn't let you parameterize the number of iterated joins. This is especially true if the thing you want to query is actually the path length.

In an HN thread from a few days ago, someone made the claim that the graph model could be represented by SQL + recursion, and recursive SQL is an extension offered by some databases. But the relational model itself cannot fully represent the graph model.

Without digging too deep, I suspect other database models run into similar problems. E.g. a document store could very easily represent a Directed Acyclic Graph as a document, but when you get into general graphs your document needs to end on a value that is the key to another graph.

This is not agree with the claim that graph databases are generally superior. I like them, and they're fun, and I think more developers should be aware of them for cases where they apply, but I also don't think they have advantages over relational or document stores when the data is natively table-shaped or DAG-shaped.

2 comments

You absolutely can use recursive SQL to build a graph in a relational database, I've done it. You make a table with a primary key and some data and then make another table that represents edges between objects in the first table. Then, like you describe, you can use recursive queries (built into Postgres) to query your graph. You end up with an adjacency list which might not be the most efficient way to represent your graph, but it works well enough up to a certain size.

I don't quite get what you mean by "the relational model itself cannot fully represent the graph model". If you can store all the edges of a graph and query them, what can't be done?

Graph Databases, depending on the underlying storage model, can have performance increases over relational databases when the JOINs are what you are interested in. This is because JOINs in SQL are often O(n) or O(log n) (if you index the join) where N is the number of rows in the target table, but following a relationship in a graph might be O(1) if you store an object with pointers to other objects right there. Writes might be more expensive though.

It all comes down to use case

It seem similar to the SQL vs. NoSQL dichotomy in that it's a question of general ad hoc queries vs. predetermined queries. SQL lets you query anything with or without an index, but you pay for this flexibility with limits on performance and scalability.

In the case of graphs, if you know the exact shape of your graph and its query patterns ahead of time, you can design the optimal structure in a KV/NoSQL store. But if you need/want the flexibility to do ad hoc queries of the graph in a reasonably performant way, that's where the graph db shines.