I think this [0] is worth reading before starting a project in julia (it's quite shocking). Does anyone know if anything has changed in julia's development process over the last year?
1. Insufficient testing & coverage. Code coverage is now at 84% of base Julia, from somewhere around 50% at the time he wrote this post. While you can always have more tests (and that is happening), I certainly don't think that this is a major complaint at this point.
2. Package issues. Julia now has package precompilation so package loading is pretty fast. The package manager itself was rewritten to use libgit2, which has made it much faster, especially on Windows where shelling out is painfully slow.
3. Travis uptime. This is much better. There was a specific mystery issue going on when Dan wrote that post. That issue has been fixed. We also do Windows CI on AppVeyor these days.
4. Documentation of Julia internals. Given the quite comprehensive developer docs that now exist, it's hard to consider this unaddressed:
> The main valid complaints [...] the legitimate issues raised [...]
This is a really passive-aggressive weaselly phrasing. I’d recommend reconsidering this type of tone in public discussion responses.
Instead of suggesting that the other complaints were invalid or illegitimate, you could just not mention them at all, or at least use nicer language in brushing them aside. E.g. “... the main actionable complaints...” or “the main technical complaints ...”
* * *
> [...] I certainly don't think that this is a major complaint at this point. [...] it's hard to consider this unaddressed [...] fixed
After reading the original post and your responses, I think the responses come across as pretty smug and dismissive.
Third-party readers would probably be more optimistic if you just left it at “we’ve made a lot of improvement since then and we’re still working on it” or similar.
Sorry if it came off as smug or dismissive, that was not the intention. I was trying to be precise – we haven't fixed everything that Dan complained about, but some of those complaints were very subjective or things that I didn't think were fair and therefore wouldn't claim to have fixed.
I read the final line completely differently to you. I read it as saying that the aforementioned issues were definitely legitimate and have therefore been addressed.
However, like you, I noted that some of the complaints had not been accepted.
This article was pointed out to me, more than once now, as a user of Julia. However, I don't see that it has too much relevance to me as a user of the language, even if its quite outdated points were still valid today.
I am part of a team that now has over 60,000 lines of Julia code in our computer algebra package(s), and the intersection between our experience as users and the points danluu made is almost nil.
If one looks at the ranking of the language on Tiobe, it's quite obviously being used a lot for a language that hasn't even reached 1.0 yet.
I think the issues one is likely to have with Julia as an evolving language have completely moved on from those made in danluu's article. In fact, it might be useful for someone to write a "constructive criticism" blog article about the state of affairs today.
My personal opinion is that I would wait until 1.0 (which will arrive in less than 2 years' time) if I were a Fortune 500 company, unless you really need to be ahead of the game and are prepared to contribute to the development of the actual language itself. But for just about anything else, if you need the features Julia provides, it's probably vastly superior to the alternatives today.
Our experience is you will occasionally have to adjust some of your code to handle changes to the language prior to 1.0, and we've had to do that a few times so far (at most a few hours' work for each 0.x point release, even with our large, complex code base). And this mainly applies if you are really pushing Julia hard and exploring interesting corners of the language. Other than this, it's more than stable enough for serious work.
My areas of focus are pretty different from that author's - e.g. I haven't tried contributing to the core language. But I've been using Julia for my research code for ~4 months now, and it's been an absolute joy.
The language allows super-readable code while remaining quite fast. The sort of research I'm doing involves interacting a bit with data then doing a bunch of simulations, and Julia excels at this. The expressiveness in how Julia handles anonymous functions, optional arguments, tuple construction/deconstruction, etc. allows for really concise configurations of how my little simulations run.
My only pain points:
- DataFrames are not as fast as the rest of the language.
- I'd like an infix function composition operator. (Possibly this exists, but I can't find it in the documentation).
No, x |> f is equivalent to f(x) while (f ∘ g)(x) is equivalent to f(g(x)). Specifically, x |> f results in a value (of the type of the return value of f) while f ∘ g always results in a function.
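A minimal sketch of the difference, assuming a Julia version where ∘ is available in Base (0.6 and later, as far as I know; it was not in Base when this thread was written). The functions inc and dbl are made up for illustration.

```julia
# Hypothetical example functions, not from the thread.
inc(x) = x + 1
dbl(x) = 2x

# Piping applies each function to a value right away and yields a value.
piped = 3 |> dbl |> inc      # same as inc(dbl(3))

# Composition builds a new function without applying anything yet.
h = inc ∘ dbl                # h(x) is inc(dbl(x))

# On versions without ∘, a small closure does the same job.
compose(f, g) = x -> f(g(x))

@assert piped == inc(dbl(3))
@assert h(3) == inc(dbl(3))
@assert compose(inc, dbl)(3) == h(3)
@assert h isa Function
```

So the pipe is handy for one-off chains on a value, while composition is what you want when the combined function itself has to be passed around, e.g. to map.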
I thoroughly enjoy julia and there's no other FLOSS to replace it, but a fair bit of that rant does still seem to stand (from my own experience).
Bugs are quite easy to come across and updating software feels more like a dice roll than a normal upgrade. Parallel computing in particular has been in a pre-alpha state for ages (which may be more of a documentation issue than an implementation one). Packages were previously very slow to load, though with pre-compilation this was partially fixed (~1 order of magnitude difference). I don't write robust software in julia, so I don't know how the error handling side has been evolving. The API inconsistencies have been getting fixed, but this typically results in broken packages until the compatibility package (Compat.jl) includes some workaround. The core code is difficult to get through and architectural-level documentation was absent last I checked.
Even with these development flaws it's still by and large an enjoyable experience to use. It solves some hard problems and it makes my own work (mainly DSP/ML) move a lot faster. I would recommend it, but as their versioning scheme indicates, there's no 1.0 release yet.
Given that syntax and core API changes still occur, it's quite amazing that Compat.jl can achieve a very high level of source code compatibility between versions. This and the solid deprecation system are some of the things which make me confident enough to start using julia in production.
Though it would have been fair to point out that this rant was written about his experiences the last time he used Julia, in Oct. 2014, nearly 18 months ago.
I, for one, much prefer tons of bugs if that's what's necessary to keep the language evolving. I like the fact that powerful features are rapidly being added to the language, even if it breaks backward compatibility. Most of my programming is scientific in nature, so I write lots of one-off scripts. "Fixing" these scripts for the newest version of Julia rarely takes more than a few minutes.
Considering that Matlab, which is supposed to be a stable platform, seems about as buggy as Julia with a lot of regressions, API changes, and new bugs every release, I, too, prefer Julia, as it's not only a nicer language with better programmer and execution efficiency, but it's also open source.
I'd rather try to fix a bug in some undocumented codebase than wait 6 months for a new black box.
The language works well for what is effectively still a beta. Sure I'd like more documentation and tests, but I'm happy to get features first. The alternative for me is trying to do some non-trivial cluster computing in C or in Python, either of which would suck.
It's an MCMC algorithm for a fancy kind of matrix factorisation.
I had a look at Spark, but its linear algebra packages seemed too limited (I guess abstraction comes at a cost). I can see that Spark would be nice if it does what you need out of the box.
Heard good things about Scala, is it straightforward to get a process on a remote machine to execute code?
> I had a look at Spark, but its linear algebra packages seemed too limited (I guess abstraction comes at a cost). I can see that Spark would be nice if it does what you need out of the box.
Did you look at MLlib and/or just using Breeze directly? There's a bit of awkwardness in the initial set up of the cluster (mainly just having LAPACK installed on all nodes, see https://spark.apache.org/docs/1.1.0/mllib-guide.html ). Spark itself is essentially just sugar to let you write a map/reduce in natural scala style and have it distributed across a cluster - it'll only work if you can factor your algorithm in a way that fits into that paradigm. (I've heard arguments that it's possible to do that with any distributable algorithm if you're clever enough, but I'm not sure I believe them).
> Heard good things about Scala, is it straightforward to get a process on a remote machine to execute code?
Honestly, no. I love the language but Spark is very much what I think of (perhaps unfairly) as typical scientific software. Spark clusters are finicky - they're cobbled together from a few unrelated projects (especially for cases where you need LAPACK as well), and it shows, especially when it comes to updating them. There are a few organizations like Cloudera (I think there was an open-source effort under the Apache umbrella somewhere too) that try to provide a working package, and various efforts with Puppet/Chef/etc. to automate the process of putting a cluster together, and it's certainly a lot better than it was even a few years ago, but a cluster still needs at least a little bit of dedicated sysadmin time (or, at a bare minimum, a programmer with a bit of *nix admin experience who's willing to get their hands dirty - that was me at times) to keep it running reliably.
If you're part of an institution that already maintains a Spark cluster - or maintains an ordinary Hadoop cluster and you're friendly enough with the sysadmins to suggest they install it - it's wonderful. If you're having to do it all from scratch I won't lie, it's going to involve a lot of fiddling and may well not be worth it for your problem.
Most people don't need more than a handful of linear algebra operations (or think they don't), so Breeze and most wrappers of LAPACK or similar libraries don't implement or wrap them. But most people who work seriously on numerical routines will quickly run into performance problems if all they do is call LAPACK routines for general matrices instead of taking advantage of matrix structure.
I have yet to come across any other linear algebra library for any other high level language that provides the depth of integration available in the Julia base library. Want all eigenvalues of a symmetric tridiagonal 10x10 matrix between 1.0 and 12.0? Simply call T=SymTridiagonal(randn(10), randn(9)); eigvals(T, 1.0, 12.0). Or if you want to work closer to LAPACK, simply call LAPACK.stein!. I don't see a wrapper in Breeze or SciPy for this function. Want an LU factorization on a matrix of high precision floats? lufact(big(randn(5,4))). And so on.
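For readers on a current release, here is roughly how I would expect the calls above to be spelled today. This is a sketch assuming Julia 1.x, where these functions live in the LinearAlgebra standard library and lufact has been renamed lu. A fixed matrix stands in for the random one so the eigenvalue check is reproducible.

```julia
using LinearAlgebra

# Symmetric tridiagonal matrix stored as just its diagonal and off-diagonal.
T = SymTridiagonal(fill(4.0, 10), fill(1.0, 9))

# Only the eigenvalues lying between 1.0 and 12.0 are computed.
λ = eigvals(T, 1.0, 12.0)
@assert all(1.0 .<= λ .<= 12.0)

# Pivoted LU factorization of a rectangular matrix of BigFloat entries.
A = big.(randn(5, 4))
F = lu(A)
@assert F.L * F.U ≈ A[F.p, :]
```

The point of the SymTridiagonal wrapper is that dispatch picks a routine specialised for that structure instead of treating T as a general dense matrix.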
Julia may not have everything users want, but its base library really tries to make matrix computations easy and accessible.
Something like the Scala type system seems like the best way to keep track of that kind of structure information and make use of it (perhaps even transparently). I can easily believe the current wrappers aren't there yet though. (Afraid I switched jobs six months ago and haven't been using Breeze or Spark since, so I can't justify working on it myself at the moment)
+1 this kind of issue is why I went for Julia - the support for lin alg (including with CUDA) is very good indeed.
The other issue being that Julia gives fine grained control over a cluster in a way something more abstract couldn't. (After cobbling together a scripting-style map reducer based on the default functionality - ClusterUtils.jl.)
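As a rough illustration of that kind of control, assuming Julia 1.x, where the parallel primitives live in the Distributed standard library (in 0.4 they were in Base). Workers are started on the local machine here; addprocs can also take a list of SSH host strings for remote machines, which is not exercised below.

```julia
using Distributed

pids = addprocs(2)                      # start two local worker processes

# Run a function on one specific worker and fetch the result.
ran_on = remotecall_fetch(myid, pids[1])
@assert ran_on == pids[1]

# Spread a map over the workers; results come back in input order.
squares = pmap(x -> x^2, 1:4)
@assert squares == [1, 4, 9, 16]

rmprocs(pids)                           # shut the workers down again
```

Being able to address a particular process by id is what a higher-level map/reduce layer usually hides.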
Cheers, this was interesting. I think my 'try spark' button would get pushed if I had to do a big job using a standard method for a company e.g. some massive GLM.
I'm an enthusiastic julia user, observer and very minor contributor. IMO a lot of the issues in this constructive rant have been addressed to some extent. For context, here's the previous HN discussion https://news.ycombinator.com/item?id=8809422
Going through the post in order:
The stable releases still have some bugs as you would expect in a young language, but 0.4 is now well below my tolerance level. For a rough idea, I now use julia daily and encounter a bug perhaps once every one or two weeks. In 0.4 I haven't encountered any bug which was a real show stopper and couldn't easily be worked around.
Testing has gotten a whole heap nicer with a decent test framework in Base (accessible in 0.4 via BaseTestNext.jl). Testing and package manager integrate in a simple but effective way which really makes the friction for writing a suite of tests for new packages very low, much lower than other languages I've used. I can't speak for actual coverage in Base, but I know it's now actually being measured and work has gone into the coverage tools.
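To give a sense of how little ceremony is involved, a minimal sketch assuming a current Julia, where the framework is the Test standard library (at the time it was Base.Test, with BaseTestNext.jl as the backport). The function under test is made up for illustration.

```julia
using Test

# Hypothetical function standing in for package code.
clamp01(x) = min(max(x, 0.0), 1.0)

# By convention this would live in test/runtests.jl, which the package
# manager's test command runs.
@testset "clamp01" begin
    @test clamp01(-2.0) == 0.0
    @test clamp01(0.25) == 0.25
    @test clamp01(7.0) == 1.0
end
```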
Consistent benchmarking is currently being addressed, with great work going on at BenchmarkTrackers.jl, and a proper setup with dedicated benchmarking hardware for the language itself. I don't have the depth of knowledge to comment on Dan's other complaints regarding skewed benchmarking.
Regarding contributing, my experience is that contributions to Base and the runtime by unknowns (myself, say) are generally met with the fair skepticism and good taste that all good maintainers should display. Sometimes I feel the core devs could do more to encourage new contributors, and the environment can feel slightly hostile when suggesting new features. I'm not sure how to entirely avoid this, when a core job of a good maintainer is to say "no" to a lot of poorly considered requests! Much of the code in Base is still commented in a minimalistic fashion, if at all. In contrast my experience in contributing to packages has been almost entirely positive, with a lot of excitement and energy leading to some really great code and interactions.
With precompilation, slow package load times have really been improved to the extent that they're no longer a major hassle, but there's still room for improvement here.
The real sting in the tail of this blog post is the paragraph about nastiness in the community. There were a couple of unfortunately worded (though not unambiguously malicious) mails on the julia-users list following Dan's post, but the discussion was largely constructive and helpful. I've no idea about the "private and semi-private communications" and I can only hope things were patched up there.
Overall I've found the julia experience almost entirely positive. It's a joy to work with for numerical and statistical problems, and we're moving forward at work to get our first major pieces of julia code into production.
> I've no idea about the "private and semi-private communications" and I can only hope things were patched up there.
I'm the co-creator that Dan was talking about. He wrote a bunch of less-than-charitable comments on the aforementioned semi-private forum – not specifically to me, but where he surely knew I would read them – to which I responded with:
You can judge for yourself whether I was nasty or dishonest. Things were, unfortunately, not patched up. Dan posted a number of responses, deleted all of them before I could read them, then left the conversation permanently.
Luu's claims made me hold off on pushing people to contribute to Julia as I waited for corroboration (or refutation) of them. I appreciate you linking to that very fair post that replies to it. The contrast between how the two of you present your claims adds credibility to yours. I also got a great laugh out of the part about one guy that barely speaks English using the project as a personal Stack Overflow. A problem I'd have not anticipated starting a language/compiler project haha.
Curious, are you all still coding the internals of the compiler in femtolisp, is most of it written in Julia indirectly relying on that, or no LISP now? A barrier to entry question basically.
The parser and some lowering passes are still written in femtolisp. There has been some discussion of switching to the native JuliaParser package [1]. However, JuliaParser doesn't implement the fairly tricky lowering passes that the femtolisp parser does. I personally would prefer to have the parser in Julia, but at this point the most pressing issue with parsing and lowering is speed – so it's possible that the parsing and lowering will be converted to C instead.
Thanks for the update. Decent plan. An alternative would be to code it in SPARK Ada for speed and correctness. Has side benefit that each component done that way won't be touched by halfassed developers because they lack the will to learn it. Not quick n dirty enough for them. ;)
> I personally would prefer to have the parser in Julia, but at this point the most pressing issue with parsing and lowering is speed – so it's possible that the parsing and lowering will be converted to C instead.
Thanks for the link. After reading the extra context, Dan's comment about the community is only more baffling than before. To me your response seems about as measured and straightforwardly honest as it could be, with a double dose of constructive investigation on the purely technical matters.
http://julia.readthedocs.org/en/latest/devdocs/julia/
So the legitimate issues raised in that blog post are fixed.