by gngeal 4569 days ago
Could anyone explain to me what it means "native" versus "non-native" graph processing in that slide show? Ditto for "native" versus "non-native" graph storage. I simply have no idea what I'm supposed to picture when I see that.

Also, on the neo4j.org page, the claim that "graph data model['s] expressiveness supersedes the relational model" seems a little bit spurious. As I understand it, the relational model and graph data models are both anchored in first-order predicate logic, and should therefore be able to express essentially the same things (although a Codd-style RDBMS needs a bit more fuss over the necessary schemas).

2 comments

Could anyone explain to me what it means "native" versus "non-native" graph processing in that slide show?

One of the leading native graph processing engines is GraphLab (http://graphlab.org/); however, the creator of GraphLab, Dr. Joey Gonzalez, is now focused on GraphX, which is essentially GraphLab built on Spark (http://spark.incubator.apache.org), which is a non-native analytics platform.

Building a graph-processing engine on a general processing system like Spark makes pre-processing and post-processing much easier.

See "Introduction to GraphX - Presented by Joseph Gonzalez, Reynold Xin - UC Berkeley AmpLab 2013" (http://www.youtube.com/watch?v=mKEn9C5bRck)

Also, a bunch of advancements in graph processing are coming down the pipe, which will be released in a few months (see https://news.ycombinator.com/item?id=6786563).

Ditto for "native" versus "non-native" graph storage.

See this post by Dr. Matthias Broecheler, the creator of Titan (https://github.com/thinkaurelius/titan/wiki)...

"A Letter Regarding Native Graph Databases" (http://thinkaurelius.com/2013/11/01/a-letter-regarding-nativ...)

So essentially, it's totally meaningless marketing bullshit? As much as I favor memory optimizations, I think that merely trying to linearize the access patterns is completely futile in the case of graph databases. At that level of brute-force speed-up, you'll most likely gain more performance by using lower-latency memory modules, or simply by using different data structures that fit your specific cache line sizes and latencies, than by trying to linearize generic graphs.

You might appreciate this landscape distinction. http://www.slideshare.net/slidarko/titan-the-rise-of-big-gra...

Furthermore, please have a look at "On Graph Computing" for a breakdown of three different categories of graph computing systems -- toolkit, database, analytics. http://markorodriguez.com/2013/01/09/on-graph-computing/

Finally, yes -- there are no theoretical expressivity gains between RDBMS and property graphs (or RDF graphs). Nor is SQL (in its Turing-complete versions) any less expressive than Gremlin (Turing-complete path recognition). The only argument you can make is that graphs are more (or less) effective at particular problems in terms of conciseness of expression and speed of execution. Typically (as expected), it's the difference between problem datasets that look like networks (graphs) and those that look like spreadsheets (tables).
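
To make the "same expressiveness, different conciseness" point concrete, here is a minimal sketch in Python that answers one friends-of-friends question two ways: as a self-join on an edge table (using the standard-library sqlite3 module) and as a two-hop walk over an adjacency list. The `knows` table, the toy `EDGES` data, and both function names are made up for illustration; nothing here is Neo4j, Titan, or Gremlin code.

```python
import sqlite3

# Hypothetical toy data: directed "knows" edges between people.
EDGES = [("alice", "bob"), ("bob", "carol"), ("bob", "dave"), ("carol", "erin")]


def friends_of_friends_sql(edges, start):
    """Relational form: a self-join on an edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE knows (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO knows VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        SELECT DISTINCT b.dst
        FROM knows AS a
        JOIN knows AS b ON a.dst = b.src
        WHERE a.src = ?
        """,
        (start,),
    ).fetchall()
    conn.close()
    return {row[0] for row in rows}


def friends_of_friends_graph(edges, start):
    """Graph form: two hops over an in-memory adjacency list."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
    result = set()
    for friend in adjacency.get(start, set()):
        result |= adjacency.get(friend, set())
    return result


if __name__ == "__main__":
    # Both formulations return the same set of people two hops from "alice".
    print(friends_of_friends_sql(EDGES, "alice"))
    print(friends_of_friends_graph(EDGES, "alice"))
```

Both functions return the same answer for the same data. What changes is how the query reads and how the lookup is executed, and each extra hop costs another join in the SQL version versus another loop level in the traversal.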