| > Not sure who is doing big data with Erlang, or why anyone would want to.

Nokia created an open-source Hadoop replacement called Disco [0] that used Erlang for the coordination/orchestration of map-reduce jobs -- an underappreciated strength of the language -- while the jobs themselves were written in Python (and later OCaml, etc.). They've shown that it can handily outperform Hadoop (at least in the canonical wordcount example shown in this talk [1] -- there may be other examples, I haven't actually watched the talk yet). They've used it to mine terabytes of logs daily, as described in this talk [2], and others have apparently used it as well.

From the abstract [3] describing the first talk, about the project:

> We will describe our experiences using Erlang within Nokia to build Disco, a lean and flexible MapReduce framework for large-scale data analysis that scales to large clusters and is used in production at Nokia. Disco is an open-source project that was started in 2008 when attempts to use Hadoop to analyze data proved to be a painful experience. The MapReduce step formed only a portion of the analytics stack, and it was felt that it would be faster to write a custom implementation that would integrate well, than adapt Hadoop with the amount of internal Hadoop expertise available. Among the crucial tasks of such an implementation would be to deal with cluster monitoring, fault-tolerance, and the management and scheduling of a large number of concurrent and distributed jobs. To keep the implementation simple, the use of a platform that provided first-class support for distribution and concurrency was imperative. This motivated the choice of Erlang/OTP to implement the core control plane of Disco. It bears stressing that this choice was driven primarily by pragmatic concerns, as opposed to any beliefs about the superiority of functional programming languages in general or Erlang in particular.

The project's homepage [0] has more information, a link to its GitHub repository, etc.
[0] http://discoproject.org/
[1] https://youtu.be/IjOGUC-iR_Q
[2] http://vimeo.com/23550705
[3] http://cufp.org/2011/disco-using-erlang-implement-mapreduce-... |