I am very happy to see this. Clojure is being used in more and more applications and seems to be quite popular in larger-scale data analytics applications, where many other solutions struggle.
It is particularly strong when you want to build a data processing/analysis pipeline for moderate amounts of data (say, below a TB). With the fantastic built-in concurrency support, small external tools (core.async parallel pipelines and perhaps aphyr/tesser for parallel reducers) and the performance of the JVM, you can crunch through these amounts of data on a laptop, without involving anything bigger like Hadoop. That saves time and makes for good consulting margins (I speak from experience).
It also happens to be the language which I use to develop fairly large applications, so I guess you could call me biased :-)
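For readers unfamiliar with the "parallel reducers" idea mentioned above, here is a rough sketch of the general pattern: fold each chunk independently, then merge the partial results. It is written in Python purely for illustration and is not tesser's API; `parallel_fold`, `chunks`, `add_value` and `merge` are made-up names. It also assumes a thread pool for simplicity, which in CPython will not speed up CPU-bound work, so treat the executor as a placeholder for whatever actually runs in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunks(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def parallel_fold(items, reducer, combiner, identity, chunk_size=1000, executor=None):
    """Reduce each chunk independently, then combine the partial results.

    `reducer(acc, x)` folds one element into an accumulator.
    `combiner(a, b)` merges two accumulators and must be associative.
    `identity` should be immutable, since every chunk starts from it.
    """
    def fold_chunk(chunk):
        return reduce(reducer, chunk, identity)

    owns_pool = executor is None
    pool = executor or ThreadPoolExecutor()
    try:
        partials = list(pool.map(fold_chunk, chunks(items, chunk_size)))
    finally:
        if owns_pool:
            pool.shutdown()
    return reduce(combiner, partials, identity)


# Example: the mean of a sequence, using (sum, count) accumulators.
def add_value(acc, x):
    return (acc[0] + x, acc[1] + 1)


def merge(a, b):
    return (a[0] + b[0], a[1] + b[1])


total, count = parallel_fold(list(range(1, 10001)), add_value, merge, (0, 0))
mean = total / count
```

The key constraint is that the combine step is associative, so the order in which chunks finish does not matter.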
Why not? I think lisps could use a visibility boost in the various data sciences. Also probabilistic programming tends to be highly declarative, which works well in a language family where "code is data" is a core philosophy.
At first glance, maybe Common Lisp would have made more sense. As far as I'm aware, the CFFI is pretty good, which means you can interact with other machine learning code that is largely written in C under the hood.
That said, there is a mature machine learning ecosystem for the JVM, not to mention the prospect of running massive probabilistic models on YARN or Spark.
By the way, there are many probabilistic programming languages in addition to PyMC3. There are Stan, Edward, Pyro (fairly new), and even the venerable BUGS/JAGS.
One advantage of Clojure over Common Lisp is its very well implemented parallelism and the functional data structures that go with it, which are hard to find in Common Lisp implementations.
One reason might be macros. Since it's a Lisp, it's very easy to implement custom syntax for the abstractions they want to expose, like defquery in their example. Something like that could only be loosely approximated in Python. Probabilistic programming, being a whole different programming model, benefits from this more than most libraries would.
It seems that you can run it directly in the browser with ClojureScript, which might help a lot with educating people about the project. If memory serves, Church did this.
Also, there is some tradition of using Lisps for AI; Church, for example, is implemented on top of Scheme.
Other probabilistic programming languages descend from Prolog, and Prolog and Lisps such as Clojure are quite closely related.
Wouldn't a language with support for continuations (Scheme or Racket, speaking of the Lisp family) be a better choice for the task and allow this to be implemented as an embedded DSL rather than a separate (even if it is tightly integrated with Clojure) language?
"Embedded DSL" vs "Separate" is a matter of wording. Anglican is macro-compiled inside Clojure into Clojure. If you like to call it a DSL, call it a DSL.
An important design feature is that it is 'anti-DSL' --- the syntax is exactly the Clojure syntax. But the program is transformed into an extended CPS form and run through an inference executor. There is an academic paper written by the Anglican team explaining the design and internals:
My mistake, I misread the docs. Anglican can actually be considered to be an embedded DSL.
> An important design feature is that it is 'anti-DSL' --- the syntax is exactly the Clojure syntax.
This is not "anti-DSL"; it is exactly what gives Anglican its embedded-DSL qualities.
And yet there are limitations: «The border between Clojure and Anglican is subtle and usually will pose no problem to most programmers, however, some confusion can arise from the fact that Anglican programs are macro compiled into CPS-style Clojure functions. This means that some wrapping of “native” Clojure functions needs to happen in order to use them in Anglican. Errors arising due to misunderstanding this boundary crop up in the form of “wrong number of argument” exceptions».
Would that language border, the wrapping of functions and the transformation to CPS still be necessary in a language with native support for continuations?
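To illustrate why such a boundary produces arity errors, here is a toy analogy in Python. It is not Anglican's actual calling convention: the argument order and all names (`cps_add`, `native_add`, `lift`, `call_cps`) are assumptions made up for this sketch. The point is only that CPS-compiled code passes extra arguments (a continuation and a state) that a plain function does not expect.

```python
def cps_add(x, y, k, state):
    """A 'compiled' function: takes a continuation `k` and a state as extra arguments."""
    return k(x + y, state)


def native_add(x, y):
    """A plain host-language function that knows nothing about CPS."""
    return x + y


def lift(fn):
    """Wrap a plain function so it follows the CPS calling convention."""
    def wrapped(*args):
        # The last two arguments are the continuation and the state.
        *values, k, state = args
        return k(fn(*values), state)
    return wrapped


def call_cps(fn, *values):
    """Call `fn` the way compiled code would: values, then continuation and state."""
    return fn(*values, lambda result, state: result, {})
```

Here `call_cps(cps_add, 1, 2)` and `call_cps(lift(native_add), 1, 2)` both work, while `call_cps(native_add, 1, 2)` raises a `TypeError` because the plain function receives four arguments instead of two.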
Anglican is a CPS state monad. Clojure functions are automatically lifted into the monad, but to maintain fast execution of Anglican code, Clojure functions must be declared as such to Anglican.
A language with native support for continuations is a nice toy but does not bring practical benefits to implementing a CPS state monad.
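To make "CPS state monad" a bit more concrete, here is a minimal sketch in Python. It is not how Anglican is implemented; every name (`sample`, `observe`, `result`, `coin_model`, `posterior_mean`) and the dictionary-based state are assumptions for illustration. Each step is a function from a state to a (value, state) pair, the rest of the program is passed in as a continuation, and the state carries a log-weight that a simple likelihood-weighting loop uses for inference.

```python
import math
import random


def sample(draw, k):
    """CPS step: draw a value, then hand it to the continuation `k`."""
    return lambda state: k(draw(state["rng"]))(state)


def observe(log_likelihood, k):
    """CPS step: add a log-weight to the threaded state, then continue with `k`."""
    def step(state):
        new_state = {**state, "log_weight": state["log_weight"] + log_likelihood}
        return k()(new_state)
    return step


def result(value):
    """Final step: return the value together with the threaded state."""
    return lambda state: (value, state)


def draw_bias(rng):
    """Uniform draw, clamped away from 0 and 1 so the logs below are defined."""
    return min(max(rng.random(), 1e-12), 1 - 1e-12)


def coin_model(heads, tails):
    """Uniform prior on a coin's bias, conditioned on the observed flips."""
    return sample(
        draw_bias,
        lambda p: observe(
            heads * math.log(p) + tails * math.log(1 - p),
            lambda: result(p)))


def posterior_mean(model, runs, seed=0):
    """Likelihood weighting: run the model repeatedly and average by weight."""
    rng = random.Random(seed)
    weighted_sum = 0.0
    total_weight = 0.0
    for _ in range(runs):
        value, state = model({"rng": rng, "log_weight": 0.0})
        weight = math.exp(state["log_weight"])
        weighted_sum += weight * value
        total_weight += weight
    return weighted_sum / total_weight


estimate = posterior_mean(coin_model(heads=8, tails=2), runs=20000)
```

With a uniform prior and 8 heads out of 10 flips the exact posterior is Beta(9, 3), whose mean is 0.75, so `estimate` should land close to that. Because the continuation is an ordinary closure here, nothing in this sketch relies on the host language having first-class continuations.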