|
|
|
|
|
by yummyfajitas
5585 days ago
|
|
In a connected graph, you don't want to cache this [1]. It will take up O(N^2) space. If you have 100k nodes, that would require 10^10 cached edges. Further, insertion time becomes worst case O(N^2). Consider a graph with two connected components (each having O(N/2) nodes). Once you add an edge that connects them, updating the view will require O(N^2/4) operations - you need to connect every node in the first component to every node in the second. A graph database is more specialized than SQL, so they can perform a very simple optimization. You don't scan the database, you just follow links. Links are typically stored with the node, so it's O(1) time. [1] In a connected graph, there is a constant time algorithm - return True. But that won't work if you also want edge distance. |
|
SQL doesn't scan either -- tables are indexed via either a hash (O(1)) or a tree (O(log edges)). While I do realize the speed advantage of a specialized structure, I feel this could be better implemented as a specialized index/storage mechanism in SQL than as an entirely new database system.