| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by smarx007 1091 days ago

The only thing I like in Neo4j is Cypher – it's powerful and intuitive. I don't use Neo4j because of two reasons:

1) It has no support for subgraph queries. In other words, you can't run a query on a graph and have the query result be a graph too. Instead, you will get a tabular result set. In SPARQL-based systems, you can run a 'CONSTRUCT' query. Very useful if you want to process the results by other parts of the code that also expect a graph (composability). See [1] and [2] if you want to take SPARQL for a spin.

2) It has no support for a standard graph data format. Their blog had some posts about using CSVs but they are a tabular data format, which means that some acrobatics are needed to extract a graph from CSV (actually, two CSVs) and none of this would be standard. Also some attempts to fit a graph peg into a tree-shaped hole (JSON, XML). To my knowledge, RDF is the only widely used standard to actually represent graphs. Unfortunately, there is a lot of confusion around RDF because (a) RDF is actually just a model and there are multiple file formats – I recommend Turtle, and (b) RDF has a semantic web heritage – forget semantic web and just use a graph data format.

But I know that industry is most familiar with Neo4j, that's why I mentioned it. To my knowledge, Stardog is one of the most advanced and performant systems (with on-prem deployment) but is very expensive. Amazon Neptune and Azure Cosmos are cloud-only, which is a hindrance for many projects. Bottom line is that graph DBMSs have a long way to go and more interest from the community is needed to motivate more dev effort.

P.S. For dense graphs, a graph DBMS may not be the best solution. Graph DBMSs also lose their appeal if your queries are not traversal-heavy.

[1]: https://data.nobelprize.org/sparql

[2]: https://query.wikidata.org/