by dusenberrymw
3854 days ago
Great question. I'm part of the committer team for the project at IBM, so I'll leave a few comments representing our thoughts.

As a quick overview, SystemML provides an R-like DSL, called DML, consisting of linear algebra primitives (vectors, matrices), built-in functions for common operations (sums, means, matrix construction, etc.), and user-defined functions (UDFs). It also provides a compiler/optimizer engine that can generate optimized runtime plans from the same DML script for a single node (laptop), Spark, or Hadoop MapReduce.

We definitely have algorithms already available as production-ready examples, but the goal of the project is to allow for declarative ML using customizable scripts written at the mathematical DSL level, rather than to provide a fixed library of algorithms at the base language level (Scala, Python, etc.).

MLlib (including the newer ML API) is awesome, and provides a great set of algorithms that fit in quite well with Scala, Python, and Java. SystemML is great in that it lets you run customizable, linear algebra-based ML scripts on Spark, with the engine optimizing them automatically. Together, it's a great combo.

We also have a Scala API that lets you embed DML in a Scala program, much as a SQL script can be embedded [http://sparktc.github.io/systemml/mlcontext-programming-guid...].

Here are our new Apache links:

https://systemml.apache.org
https://github.com/apache/incubator-systemml
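To give a rough feel for what "an algorithm written at the linear algebra level" means, here is a minimal sketch in plain Python. It is not DML and does not use any SystemML API; the helper names (`transpose`, `matmul`, `fit_linear_regression`), the step size, and the toy data are all assumptions made up for illustration. The point is only that the whole algorithm reduces to a few matrix expressions, which is the kind of script an optimizer could plan for different backends.

```python
# Illustrative only: plain Python, not DML and not the SystemML API.
# All helper names and parameter values here are hypothetical.

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def fit_linear_regression(X, y, step=0.1, iterations=2000):
    """Batch gradient descent on mean squared error, written as matrix ops."""
    n = len(X)
    w = [[0.0] for _ in X[0]]  # column vector of weights, one per feature
    Xt = transpose(X)
    for _ in range(iterations):
        # residual = X w - y
        residual = [[p[0] - t[0]] for p, t in zip(matmul(X, w), y)]
        # gradient direction = X^T (X w - y)
        grad = matmul(Xt, residual)
        # w = w - step * grad / n
        w = [[wi[0] - step * gi[0] / n] for wi, gi in zip(w, grad)]
    return w

# Toy data generated from y = 1 + 2x; the first column is the intercept term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [[1.0], [3.0], [5.0], [7.0]]
w = fit_linear_regression(X, y)
```

On this toy data the weights should settle near an intercept of 1 and a slope of 2. Changing the algorithm means editing the three matrix expressions in the loop, not swapping out a library call.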