by QuackingTheQ 1491 days ago
I've spent a lot of time developing large computational codebases in Julia, and I think the most insidious of these issues stems from the lack of a formal way to enforce interfaces. Using one of the common packages to build a trait system and add some sort of guarantee that all the right methods are implemented for a given trait simplifies maintenance dramatically.

This doesn't catch mathematical bugs, but those crop up everywhere. Instead, knowing what an interface must specify, so that you can trust your implementation, is crucial, and being able to know when it is invalidated is invaluable.

I've had a few awful bugs involving some of the larger projects in this language, but a proper interface/trait system would simplify things exponentially. There are some coding style things that need to be changed to address this, like using `eachindex` instead of `1:length(A)` for array iteration as the example in the article points out. However, these should be one-off lessons to learn, and a good code linter should be able to catch potential errors like this.
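
To make the `eachindex` point concrete, here is a minimal sketch. `ZeroBased` is a hypothetical toy type written for this example (a stand-in for what packages such as OffsetArrays.jl provide), and both summing functions are made up for illustration.

```julia
# Toy vector whose valid indices are 0:n-1 instead of 1:n.
# Hypothetical type for illustration only.
struct ZeroBased{T} <: AbstractVector{T}
    data::Vector{T}
end

Base.size(A::ZeroBased) = size(A.data)
Base.axes(A::ZeroBased) = (0:length(A.data)-1,)
Base.IndexStyle(::Type{<:ZeroBased}) = IndexLinear()
Base.getindex(A::ZeroBased, i::Int) = A.data[i + 1]

# Assumes 1-based indexing: skips index 0 and runs one past the end.
function sum_one_to_length(A)
    total = zero(eltype(A))
    for i in 1:length(A)
        total += A[i]
    end
    return total
end

# Asks the array for its own indices, so it works for any axes.
function sum_eachindex(A)
    total = zero(eltype(A))
    for i in eachindex(A)
        total += A[i]
    end
    return total
end
```

With `A = ZeroBased([10, 20, 30])`, `sum_eachindex(A)` visits indices 0, 1, 2, while `sum_one_to_length(A)` throws a `BoundsError` at index 3. If the loop were marked `@inbounds`, that error would be skipped and the result would be silently wrong.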

Between a good code linter (or some static analysis, I'm pulling for JET.jl) and a formal interface spec, I really think most of Julia's development-side issues could be quelled.


I agree with the kernel of your point here, but also with the author of the article when he says "But systemic problems like this can rarely be solved from the bottom up, and my sense is that the project leadership does not agree that there is a serious correctness problem. They accept the existence of individual isolated issues, but not the pattern that those issues imply."

My impression is that the Julia core devs are more focused on functionality and being able to construct new, more powerful, faster capabilities than on reflecting on how the foundations could or should be made more rigorous. For this, I think the devs have to philosophically agree that soundness in the large should be a first-tier guiding principle, and that the language should have mechanisms whereby correctness-by-construction can be encouraged, if not enforced. Presently, notions of soundness seem to be considered only in the small, such as the behavior of specific floating point ops. Basically, I don't think the core devs are as concerned with soundness, rigor, and consistency as they are with being able to build more impressive capabilities.

I don't want this to sound like I'm ungrateful for the awesomeness that Julia and its ecosystem do bring to the table. For numerical computing, I don't see any alternatives whose tradeoffs are more favorable. But it is disappointing that it doesn't seem to have learned the lessons about rigorous language design, and the language-level implications for engineering vs. craftsmanship, appropriate for a twenty-first century language.

Sounds like Julia needs a Snow Leopard/Mountain Lion/High Sierra release - no new features, just cleaning things up...

Could some of the need for interfaces be addressed by providing an extensive test battery for each type of object? It seems like if something claims to be an implementation of a floating point number, it should be possible to smash that type into every error ever found to uncover implementation errors.

It's possible to hack interface verification into place at test-time, but that has a few problems:

1. Running the whole testing framework to determine if you implemented an interface is a high overhead when you're developing

2. You have a lot of tests to write to really check every error. Perhaps a package which defines an interface could provide a tester for this purpose

3. Interfaces should be attached to the types, and that should be sufficient for verifying the interface
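
As a rough sketch of the "package provides a tester" idea in point 2, a package could ship a small battery of laws that any conforming type is expected to satisfy. The names `LAWS` and `failed_laws` are hypothetical, and the three laws are only examples, not a real specification of any Julia interface.

```julia
# Hypothetical law battery: each entry pairs a name with a predicate
# that is checked over every pair of sample values.
const LAWS = [
    "additive identity" => (x, y) -> x + zero(x) == x,
    "multiplicative identity" => (x, y) -> x * one(x) == x,
    "commutative multiplication" => (x, y) -> x * y == y * x,
]

# Return the names of the laws that fail for at least one sample pair.
function failed_laws(samples)
    return [name for (name, law) in LAWS
            if !all(law(x, y) for x in samples, y in samples)]
end
```

Calling `failed_laws([1.0, 2.5, -3.0])` returns an empty list, while `failed_laws([[1 2; 3 4], [0 1; 1 0]])` reports `"commutative multiplication"`, because those two matrices do not commute. A battery like this only checks the samples it is given, so it complements rather than replaces an interface declaration.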

I would settle for something like checking for the implementation of methods a la BinaryTraits.jl over what we have now, which is nothing. A huge step would be documentation and automated testing that proper interface methods are implemented, not even verifying if they're "correct". This drastically reduces the surface area you need to write and check to confirm compatibility with outside code.
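
A minimal sketch of that kind of existence-only check, using just `hasmethod` from Base. This is not the BinaryTraits.jl API; the `area`/`perimeter` interface, `SHAPE_INTERFACE`, and `missing_methods` are all hypothetical names made up for the example.

```julia
# Hypothetical interface: a "shape" must implement area(x) and perimeter(x).
function area end
function perimeter end

const SHAPE_INTERFACE = (area, perimeter)

# List the interface functions that have no method accepting a single T.
# This only checks that methods exist, not that they are correct.
missing_methods(T::Type) =
    [f for f in SHAPE_INTERFACE if !hasmethod(f, Tuple{T})]

struct Square
    side::Float64
end
area(s::Square) = s.side^2
perimeter(s::Square) = 4 * s.side

struct Circle
    radius::Float64
end
area(c::Circle) = pi * c.radius^2   # perimeter deliberately left out
```

Here `missing_methods(Square)` is empty and `missing_methods(Circle)` contains `perimeter`. A check like this is cheap enough to run at package load or in a single test, without exercising the whole test suite.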

This simple interface specification does produce design issues of its own, but correctness is much easier to handle if you know what needs to be correct in the first place.

Yes, although that seems like the easy half of this, making sure `struct NewNum <: AbstractFloat` defines everything. There aren't yet tools for this but they are easy to imagine. And missing methods do give errors.

The hard half seems to be correctness of functions which accept quite generic objects. For example, writing `f(x::Number)` in order to allow units means you also allow quaternions, but many functions doing that will incorrectly assume numbers commute. (And not caring is, for 99% of these, the intention. But it's not encoded anywhere.) Less obviously, we can differentiate many things by passing dual numbers through `f(x::Real)`, but this tends to find edge cases nobody thought of. Right now if your algorithm branches on `if det(X) == 0` (or say a check that X is upper triangular) then it will sometimes give wrong answers. This one should be fixed soon, but I am sure there are other subtleties.
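
The branching problem can be reproduced with a toy dual number. This is a sketch under stated assumptions: `Dual` here is a hypothetical hand-rolled type, not the one from any real autodiff package, and its `iszero` deliberately looks only at the value part, which is the assumption that triggers the issue.

```julia
# Toy dual number: val carries the value, der carries the derivative.
# Hypothetical type for illustration only.
struct Dual <: Real
    val::Float64
    der::Float64
end

Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.zero(::Dual) = Dual(0.0, 0.0)
# Assumption: the zero test ignores the derivative part.
Base.iszero(d::Dual) = iszero(d.val)

# Mathematically this is x + x everywhere, so the derivative is 2.
# The shortcut branch returns a constant zero, which drops the derivative.
doubled(x::Real) = iszero(x) ? zero(x) : x + x
```

Evaluating `doubled(Dual(2.0, 1.0)).der` gives 2.0 as expected, but `doubled(Dual(0.0, 1.0)).der` gives 0.0: the function value at zero is right, yet the derivative is wrong because the branch was taken on the value alone. Nothing in the signature `f(x::Real)` records that the function is only valid for types where such a shortcut is harmless.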