| HN Mirror

First of all, I am deeply involved with Julia, so be wary of any biases as I try to stay objective.

Lattner et al. outright says that "[We] picked Swift over Julia because Swift has a much larger community, is syntactically closer to Python, and because we were more familiar with its internal implementation details - which allowed us to implement a prototype much faster." and I only buy the last point – which, mind you, is an argument I would have made as well if I was sitting on a whole team specialised in a language. They acknowledge that “Julia is another great language with an open and active community”, so I would hardly say that your “unreliable” statement is shared by them.

Apart from the very strong practical argument made above, I think “Deployment” is really what it boiled down to. When Google launched TensorFlow, the general sentiment among many in the research community was “why”? As in, why this big move on the part of a major corporation? Back then I stated that the reason I saw was that for better or for worse, Google – and many others – see a future where Machine Learning (or Artificial Intelligence, if you prefer) would be embedded into almost any product. For this to be feasible, you need: know-how throughout the company, see the excellent TensorFlow tutorials/training, and the ability to deploy models across their entire spectrum of devices/environments. This is what it would mean to be an “AI first company”. One of the reason why TensorFlow at its core is a static graph, I would bet is because they saw deployment as key back then, as they do now.

However, TensorFlow is the very embodiment of the “two language problem” [1]; as anyone that tries to wrap TensorFlow in language X quickly realises, pretty much everything you care about on a high level of abstraction is written in Python and the underlying C++ API is about as bare bone as it gets. Fun fact, this is why TensorFlow.jl [2] goes through both the Python and C++ API. Now, add to this that PyTorch recently arrived on the scene and quickly became a very real “threat” to TensorFlow since dynamic graphs (eager execution) is far more intuitive to work with and enables you to express a richer set of models. However, it suffers from deployment issues since it is intrinsically tied to the Python interpreter. TensorFlow Eager was most likely a stop-gap measure and it was perceived to be clunkier than PyTorch, also after TensorFlow Fold being left to rot I sincerely doubt that many are willing to buy into yet another code branch to interoperate with TensorFlow “proper”. This all sets the stage for the TensorFlow Swift announcement.

[1]: https://julialang.org/blog/2012/02/why-we-created-julia

[2]: https://github.com/malmaud/TensorFlow.jl

From my academic perspective, this is a great time to be a Machine Learning researcher; we have plenty of options and the attention we are receiving from other software and hardware communities is amazing. As for Swift “vs” Julia, they both have their own issues.

Swift has a non-existing scientific computing community (this is why I find the “much larger community” argument disingenuous, as mobile application developers do not count in this particular case), they will have to build it entirely from scratch and community building is difficult. But they have the power of Google and its minions to work on the software itself (Hello DeepMind employees, have you gotten over the sour taste in your mouth from being forced to switch from Torch (Lua) to TensorFlow (Python) yet? I suspect Mountain View has ordered another meal for you!), perhaps this is sufficient to overcome the current lack of a community? Time will tell. There is also the problem of external collaboration that we have seen with TensorFlow, while it is open source the direction of the development is partially hidden which makes bringing in open source contributors trickier than necessary – think, no unified public forum where things are discussed.

Julia, while it has a (strong?) scientific computing community, it lacks static compilation from the perspective of “deployment” and is ultimately a team of ragtag researchers and open source contributors spread across the globe. Can they keep up? Again, time will tell. There is also the classic “issues” with Julia not being statically compiled, but I can see this swing either way depending on the overall direction in preference for static vs dynamic over the next decade.

Ultimately, they are both on top of LLVM and I suspect that much will be learnt from the other; there is room for more than one approach. My decision to side with Julia is partially to stay my own course, partially a preference for “the bazaar” development model, and partially because I have a hunch that Julia has a better chance to capture the scientific computing community as a whole which is likely to yield benefits down the line. As I said before, time will tell.