by okram 3959 days ago
Gremlin is an Apache Software Foundation query language and, as such, can be used by any graph system (Titan, Neo4j, OrientDB, etc.). It is not bound to a particular vendor.

Gremlin has a natural compilation to the common distributed vertex-centric computing model (bulk synchronous parallel for graphs). Thus, Gremlin works for both OLTP (graph databases) and OLAP (graph processors). The Apache distribution provides OLAP connectivity to Apache Hadoop, Spark, and Giraph.

Gremlin supports both imperative path expressions and declarative pattern matching.
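
To make the imperative/declarative distinction concrete, here is a minimal sketch in plain Python. The toy graph and the `out`, `imperative`, and `match` helpers are invented for this illustration and are not Gremlin's API; they only contrast spelling out a walk step by step with stating patterns and letting a small solver bind the variables.

```python
# Toy graph, invented for illustration: (source, label, target) edges.
EDGES = [
    ("marko", "knows", "josh"),
    ("marko", "knows", "vadas"),
    ("josh", "created", "lop"),
    ("marko", "created", "lop"),
]


def out(vertex, label):
    """Targets reachable from `vertex` over one edge with `label`."""
    return [t for (s, l, t) in EDGES if s == vertex and l == label]


def imperative(start):
    """Imperative style: spell out the walk, one step after another."""
    results = []
    for friend in out(start, "knows"):
        for project in out(friend, "created"):
            results.append((friend, project))
    return results


def match(patterns, binding=None):
    """Declarative style: return every variable binding satisfying all patterns.

    A term starting with "?" is a variable; anything else is a constant.
    """
    binding = binding or {}
    if not patterns:
        return [binding]
    (a, label, b), rest = patterns[0], patterns[1:]
    solutions = []
    for (s, l, t) in EDGES:
        if l != label:
            continue
        new = dict(binding)
        ok = True
        for term, value in ((a, s), (b, t)):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
            elif term != value:
                ok = False
        if ok:
            solutions.extend(match(rest, new))
    return solutions


walked = imperative("marko")
matched = match([("marko", "knows", "?f"), ("?f", "created", "?p")])
```

Both calls answer the same question (which projects were created by people marko knows); the first fixes the evaluation order by hand, the second leaves it to the solver.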

Gremlin can be embedded in any host language. There is no "fat string" query with a separate result set: the user's database query code and data manipulation code are in the same language. There exist Gremlin-Java8, Gremlin-Groovy, Gremlin-Scala, Gremlin-Clojure, Gremlin-PHP, etc.
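
As a rough illustration of what "no fat string" means, here is a small Python sketch. The `ToyTraversal` class, the `V` helper, and the sample data are hypothetical stand-ins written for this example, not any TinkerPop library; the point is only that the query is built from ordinary method calls and its result is an ordinary host-language list.

```python
# Toy stand-in for a host-language-embedded traversal; NOT the TinkerPop API.
# Sample data is invented: vertex id -> properties, plus labeled directed edges.
VERTICES = {
    1: {"name": "marko", "age": 29},
    2: {"name": "vadas", "age": 27},
    3: {"name": "josh", "age": 32},
}
EDGES = [(1, "knows", 2), (1, "knows", 3)]


class ToyTraversal:
    """Hypothetical fluent traversal over the in-memory graph above."""

    def __init__(self, ids):
        self.ids = list(ids)

    def has(self, key, value):
        # Keep only vertices whose property `key` equals `value`.
        return ToyTraversal(i for i in self.ids if VERTICES[i].get(key) == value)

    def out(self, label):
        # Move from each current vertex along outgoing edges with this label.
        return ToyTraversal(
            dst
            for i in self.ids
            for (src, lbl, dst) in EDGES
            if src == i and lbl == label
        )

    def values(self, key):
        # Terminal step: return plain Python values, not a serialized result set.
        return [VERTICES[i][key] for i in self.ids]


def V():
    """Start a traversal at every vertex (hypothetical helper)."""
    return ToyTraversal(VERTICES)


# The query and the post-processing live in the same language.
names = V().has("name", "marko").out("knows").values("name")
shouting = [n.upper() for n in names]
```

Because the traversal is ordinary code, the compiler or IDE of the host language can check it, and the results need no parsing step before further manipulation.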

Gremlin is Turing complete. Almost any sufficiently complex language is. However, Gremlin is related to a Turing machine by a very simple mapping.

See http://arxiv.org/abs/1508.03843 for the details of the aforementioned benefits.

2 comments

Thanks for the post! You mentioned Spark via OLAP connectivity. Can you please elaborate a little on how Gremlin works with Spark? Does it use the GraphX API behind the scenes, or is it just Spark? Are there any sources on how well it works?
Gremlin (over Spark) does not use GraphX. It simply represents the graph as a tensor RDD (i.e. a multi-layered matrix) and, using Spark's functional library, implements BSP-based vertex-centric computing (i.e. message passing). You can see examples and a diagram explaining how it works at this location:

http://tinkerpop.incubator.apache.org/docs/3.0.0-incubating/...
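
For readers unfamiliar with BSP-style message passing, the idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not TinkerPop's Spark implementation: the `bsp_min_label` function and the sample adjacency map are invented here, and the computation (propagating the smallest vertex id through each connected component) was chosen only because it is short.

```python
# Toy BSP loop: a sketch of the general model, NOT TinkerPop's Spark code.
# In each superstep every vertex reads only the messages sent to it in the
# previous superstep, updates its own state, and sends new messages.

def bsp_min_label(adjacency, max_supersteps=50):
    """Label each vertex with the smallest vertex id in its component.

    `adjacency` maps a vertex id to a list of neighbor ids (undirected graph,
    so every edge is listed in both directions).
    """
    # Initial state: each vertex is labeled with its own id.
    label = {v: v for v in adjacency}

    # Superstep 0: every vertex sends its label to all of its neighbors.
    inbox = {v: [] for v in adjacency}
    for v, neighbors in adjacency.items():
        for n in neighbors:
            inbox[n].append(label[v])

    for _ in range(max_supersteps):
        outbox = {v: [] for v in adjacency}
        changed = False
        for v in adjacency:
            # A vertex sees only its own state and its incoming messages.
            best = min(inbox[v], default=label[v])
            if best < label[v]:
                label[v] = best
                changed = True
                for n in adjacency[v]:
                    outbox[n].append(best)
        if not changed:
            break  # no vertex updated, so no messages are in flight
        inbox = outbox  # synchronization barrier between supersteps
    return label


# Two components: {1, 2, 3} and {4, 5}.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels = bsp_min_label(graph)
```

Since each vertex touches only its own state and its inbox, the inner loop over vertices can be partitioned across workers, which is what makes the model a natural fit for a distributed engine.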

Looking forward to Titan 1.0 :)