If you're interested in experimenting with Datalog rather than necessarily writing something for production, have a look at Datalog Educational System: http://des.sourceforge.net/
Thanks for sharing. There is one very significant conceptual error early on, however, and it is captured first in this statement: "The :- means if and only if, or iff". `:-` means if - or more precisely represents material conditional - where the consequent is on the left and the antecedent is on the right. iff is logical biconditional.
Indeed, ":-" is meant to represent the left-facing arrow of implication. In logic programming papers it is common to typeset it as an actual arrow, for example:
I always remind myself that it's not an if-and-only-if with this argument: since there could always be another rule that is satisfied to make that fact true, it can't be iff.
Though coming from Datomic, I'm curious how much of my knowledge is Datomic-specific rather than how you'd generally approach a database queryable with Datalog. For example, do you need four indexes like Datomic (https://docs.datomic.com/on-prem/indexes.html) to make Datalog queries fast?
Great post! Still working through it, but there is a slight error in the nested diagram at the start. Relational algebra has set difference, which is akin to negation-as-failure, but it lacks recursion. So the positive Datalog and RA circles should overlap without either containing the other. See http://www.lifl.fr/%7Ekuttler/elfe/biblio/datalog-overview-g...
Nice post. Still, I find the most accessible article describing datalog is "What you Always Wanted to Know About Datalog (And Never Dared to Ask)." by Ceri, Gottlob, Tanca (1989)
Suppose :- means iff. Turtle is Mortal (lines 2+4, implication to the left). Because Turtle is Mortal, it must be a Human (line 3, implication to the right).
Is it really valid according to Datalog semantics?
If model-theoretic semantics and the various ways to slice, dice, and extend Datalog are interesting to you, then almost any talk by Peter Alvaro might be as well.
Transitive closure is the first thing nearly every introduction to datalog (or Prolog, for that matter) will show you. All you had to do was click the link and scroll down:
Re: complexity, read up on "semi-naïve evaluation" for the usual execution strategy, which basically involves keeping lists of newly-added tuples in each iteration of the fixpoint loop and only processing deltas related to those new additions. This, combined with good index inference for each relation (predicate), so that e.g. the node-neighbor lookup is O(log |V|), should result in efficient execution...
Termination is achieved in brian_cloutier's example (https://news.ycombinator.com/item?id=19423320) by the use of different clauses (e.g., `Path` vs. `Edge`), so that all recursion is productive.
Yes, but there's more to it than just that. It's possible to write Prolog programs which never derive new atoms yet still fail to terminate.
It might be more accurate to say that Datalog programs cannot derive new atoms and this allows Datalog interpreters to use a search strategy which is guaranteed to terminate.
EDIT: Thinking about this more, I'm not sure why Prolog couldn't also use breadth-first search. So maybe both are necessary: Datalog not only disallows creating new atoms, but it also has a better search strategy; the combination of the two results in guaranteed termination.
My understanding is that Datalog terminates because it does not allow functions as arguments to predicates. No functions means it's not possible to create infinite terms:
P(f(x)).
p(f(f(x)).
p(f(f(f(x)))).
... etc
In fact I understand that termination is guaranteed even if a datalog program is executed by a Prolog interpreter. Or in other words, it's a result of the language semantics, not its implementation.
(but I might be wrong about this- corrections welcome).
I haven't tried datascript, which appears to support negation. Maybe I will try that if/when I revisit this interest someday.