Hacker News new | ask | show | jobs
by bad_user 4013 days ago
Scala's minor versions don't break compatibility. A year ago Scala 2.11 (major) was being released, which means you're talking about either 2.10.x or 2.11.x, neither of which broke binary compatibility between minor versions. And major versions usually have source level compatibility, our upgrade from 2.10 to 2.11 being pretty smooth.

The fundamental difference is that Clojure gets distributed as source and not as compiled .class files. Being an interpreted language that gets compiled on the fly does have some advantages. But it also has drawbacks. Startup performance suffers, the Java interoperability story is worse, many tools such as Android's toolchain expect bytecode, etc ...

The problem that Scala has (and also Clojure, as soon as you do AOT) is that Scala's idioms do not translate well into Java bytecode, as Java's bytecode is designed for, well, Java. Therefore even small and source-compatible changes, like adding another parameter with a default value to a method or like adding a method to a trait, can trigger binary incompatibilities.

The plan for Scala is to embed in the .class, besides the bytecode, the abstract syntax tree built by the compiler and then the compiler can take a JAR and repurpose it for whatever platform you want. This is part of the TASTY and the Scala-meta projects. If you think about it it's not that far from what Clojure is doing, except that this is done in the context of a static language that doesn't rely on the presence of an interpreter at runtime. Of course, LISPs are pretty cool. And of course, you'll still need to recompile your projects, but at least then the dependencies won't have to be changed.

2 comments

Clojure itself actually is distributed as compiled .class files. We take some effort to ensure that there are not changes that break binary compatibility from AOT-compiled Clojure code (which is not uncommon) and break its calls into the Clojure compiler or runtime.
> and also Clojure, as soon as you do AOT

Not so much. You can do separate compilation with Clojure to a far greater extent than Scala (perhaps even completely). The Java interfaces of Clojure functions haven't changed in incompatible ways in a very, very long time.

The Java interfaces that Clojure relies on don't matter that much. On the other hand what is the bytecode representation of a protocol or of a multi-method?

Scala has an equivalent representation for everything it has. For example traits are just Java interfaces but with corresponding static methods. Another example is default parameters (a concept Java doesn't have) is done with method overloading. The Scala compiler has to infer Scala-specific stuff (like signatures using Scala features) from compiled bytecode. Functions that aren't methods (e.g. anonymous functions) get compiled either to static methods, or to classes that have to be instantiated (in case it's about closures closing over their context). Scala does not have the concept of static methods, but it has singleton objects, which can also implement interfaces. Of course, these get compiled to static methods, but there's also a singleton instance instantiated for those cases in which you want to use that singleton object as a real polymorphic instance. Etc, etc... So there's a protocol in place for how to encode this in the bytecode, there's a protocol for everything. And so even small changes in Scala's standard libraries can trigger big bytecode changes that end up being backwards incompatible.

Clojure doesn't have to do this, because Clojure dependencies get distributed as source-code, as I've said.

> Clojure doesn't have to do this, because Clojure dependencies get distributed as source-code

Clojure doesn't have to do this because it was designed to allow separate compilation as much as possible[1], and because it hardly ever changes binary representation in backwards-incompatible ways. Protocols and multimethods are indeed handled at the call-site, but in such a way that a change to the protocol/multimethods don't require re-compilation of the callsite[2]. Similarly, Kotlin, a statically typed language distributed in binary, is also designed to allow separate compilation as much as possible[3]. If a feature would break that property (e.g. require re-compilation of a callsite when an implementation changes at the call target), that feature simply isn't added to the language.

This is a design that admits that extra-linguistic features (like separate compilation) are as important as linguistic abstractions (sometimes more important).

BTW, separate compilation isn't only a concern with Java class files. Object file linking also places limits on how languages can implement abstractions yet still support separate compilation. Some languages place less emphasis on this than others.

[1]: In fact, as a Lisp, Clojure's unit of (separate) compilation isn't even the file but the top-level expression. No top-level expression should require re-compilation if anything else changes.

[2]: The implementation of protocols: https://github.com/clojure/clojure/blob/master/src/clj/cloju...

[3]: Even Java doesn't support 100% separate compilation. There are rare cases where changes to one class requires recompilation of another.

In my opinion choosing a different language than the host (in this instance Java) has to bring enough advantages to balance out the disadvantages, like less (idiomatic) libraries, less documentation, less tools, less developers available on the market. Not to pick on Kotlin, I'm sure there are people that love it, however I personally don't see what advantages a language like Kotlin brings over Java 8, being CoffeeScript versus Javascript all over again. With Clojure or Scala we are talking about languages going into territories that Java will never venture into, for the simple reason that Java is designed to be mainstream, which really means appealing to the common denominator. Of course, as we've seen with CoffeeScript, the market can be irrational, but it started to go away already, after a new version of Javascript plus the realization that everything brought by those small syntactic differences was bullshit.

Going back to Clojure and the need or lack thereof to recompile call-sites, given that Clojure is a dynamic language that doesn't have to concern itself with much static information and that does get distributed as source-code, I feel that this is an apples versus oranges discussion. But anyway, lets get back to protocols. So protocols do generate corresponding Java interfaces, just like Scala's traits. Except that Clojure being a dynamic language, there isn't much to do when your functions look like this to the JVM: https://github.com/clojure/clojure/blob/master/src/jvm/cloju...

There's also the issue that Clojure's standard library has been more stable. Well, that can be a virtue and in the eye of the beholder can be seen as a good design, however it has many things that need to be cleaned out. As a Clojure newbie I couldn't understand for example why I can't implement things that work with map or filter or mapcat, or things that can be counted, or compared, only to find out that its protocols and multi-methods aren't used in its collections and that there isn't a Clojure specific thing I can implement to make my own things that behave like the builtins. It's also disheartening to see that the sorted-set wants things that implement Java's Comparable. Clojure's collections have sometimes surprising behavior - like for example I might choose a Vector because it has certain properties, but if you're not careful with the operations applied you can end up with a sequence or a list and then you have to back-trace your steps (e.g. conj is cool, but the concept should have been applied to other operators as well IMHO). Transducers are freaking cool, however I feel that the protocol of communication isn't generic enough, as I can't see how it can be applied to reactive streams (e.g. Rx) when back-pressure is involved (might be wrong, haven't reasoned about it enough). As any other language or standard library, Clojure is not immune to mistakes and personally I prefer languages that fix their mistakes (which I'm sure Clojure will do).

EDIT: on "separate compilation", not sure why we're having this conversation, but generally speaking Scala provides backwards compatibility for everything that matches Java. Adding a new method to a class? Changing the implementation of a function? No problem. On the other hand certain features, like traits providing method implementations or default parameters are landmines. Along with Java 8 things might improve, as the class format is now more capable. As I said, there's a roadmap to deal with this and with targeting different platforms (e.g. Java 8 versus Javascript versus LLVM) through TASTY, but I don't know when they'll deliver.

Re collections: There are Java interfaces you can implement to make your own collections that work as built-ins. In general, nothing in the Clojure compiler or runtime is written based on concrete collection classes, only on the internal interfaces. It's perfectly feasible to write a deftype that implements Counted and Associative and whatever else and make your own things that work with all the existing standard library. There are plenty of examples of this.

Re sorted-set: Any 2-arg Clojure function that returns -1/0/1 comparable semantics will work.

Re staying "in vectors": The subtle blurring of the sequential collections between the collections and sequences is part of what makes so much of working with data in Clojure easy. However, there are certainly times when you want more control over this; fortunately in 1.7 with transducers you have the ability to control your output context (with functions like into or conj) and this is easier now than ever.

Re rx: several people have already implemented the use of transducers with Rx. I know there's a RxJs and I'm pretty sure I saw something re JavaRx although I can't put my finger on it right now.

Hey, please consider my rants to be of good faith, plus I'm a rookie :-)

On Rx, the step method is synchronous. It works for the original Rx protocol and it works for RxJS because RxJS has implemented backpressure with the suspend/resume model (e.g. in the observer you get a handle with which you can suspend or resume the data source), but otherwise the model is pretty much like the original, in that the source calls `onNext(elem)` continuously until it has nothing left, so it is pretty much compatible with transducers. RxJava started as a clone of Rx.NET, but is now migrating to the Reactive Streams protocol, which means that the observer gets a `request(n)` function and a `cancel()`.

And this changes everything because `request(n)` and `cancel()` have asynchronous behavior, very much unlike the step function in transducers. And furthermore, implementing operators that are asynchronous in nature requires juggling with `request(n)` and with buffering. For example the classic flatMap (or concatMap as called in Rx) requires to `request(1)`, then subscribe to the resulting stream and only after we're done we need to request the next element in the original stream.

Then there's the issue that the protocol can be dirty for practical reasons. For example `onComplete` can be sent without the observer doing a `request(n)`, so no back-pressure can happen for that final notification and if the logic in the last onNext is happening asynchronously, well this can mean concurrency issues even for simple operators. Although truth be told for such cases there isn't much that the transducers stuff can do.

I might be wrong, I'd love to see how people can implement this stuff on top of transducers.

> for the simple reason that Java is designed to be mainstream,

Not really. Java was designed to be practical, and it ended up being mainstream.

Most languages want to become mainstream (including Scala), most of them just fail to achieve that goal for a variety of reasons, often due to their design decisions (but not always).

> Most languages want to become mainstream (including Scala), most of them just fail to achieve that goal for a variety of reasons, often due to their design decisions

A language will often become used in one particular domain, often not the domain it was designed for. Take another JVM language Groovy -- it's used a lot for scripting throwaways used in testing, including build files for Gradle, and was extended with a meta-object protocol for Grails which has risen and fallen in sync with Rails. But then its backers retrofitted it with a static typing system and have tried to promote it for building systems, both on the JVM and Android but virtually no-one's biting.

I disagree, Java was designed to be marketable to the layman and for a very good reason ...

Software companies can scale either horizontally (e.g. hiring more people) or vertically (hiring the best). Because of the classical bell curve distribution, really good people are hard to find, therefore horizontal scalability is very much preferred (and obviously not everybody can have "the best"). Unfortunately you can't really grow a company both horizontally and vertically. The reason for it is because getting good results from mediocre people requires lots of processes, coordination/synchronization and politics of the sort that doesn't appeal to really good people and that's because really good people have a choice, because of the supply-demand equation which favors them.

In the wild you'll also notice another reason for why horizontal scalability is preferred. Corporate departments and consultancy companies are usually scaled horizontally because they get paid per developer/hour and unfortunately there are only 24 hours in a day. Hence the bigger the team, the bigger the budget. On the other hand small teams happen in startups where you often see 2-3 people doing everything. Small teams also happen in big software companies, when speaking of either privileged seniors or folks that handle real research.

Big teams are getting things done, as Alan Kay was saying, on the same model as the Egyptian pyramids, with thousands of people pushing large bricks around. Small teams on the other hand tend to handle complexity by means of better abstractions. Unfortunately better abstractions are harder to understand, as they are first of all, unfamiliar.

So back to Java ... it's actually a pattern that you tend to notice. Such languages are afraid of doing anything that wasn't proven already in the market. The marketing for such languages also relies on fear and doubt of the alternatives, which Sun took advantage of, appealing to the insecurity of the many. This is one language that refused to implement generics because they were too hard and then when they happened, the implementation chose use-site variance by means of wildcards, with optional usage, possibly the poorest choice they could have made. Generic classes are invariant of course, except for arrays. This is the language that doesn't do operator overloading, as if "BigInteger.add" is more intuitive than a "+", but on the other hand it does have a + for Strings. This is the language in which == always does referential equality, so you've got tools converting every == to .equals because that's what you need most of the time. The language which until recently didn't have anonymous functions because anonymous classes should be enough. This is the language in which everything needs to be in a class, even though the concept of "static method" makes no sense whatsoever. This is the language in which at least 30% of all libraries are doing bytecode manipulation to get around the language's deficiencies. Such languages also avoid metalinguistic abstractions like a plague, such as macros, because OMG, OMG, people aren't smart enough for that (though truth be told, you're expanding your language every time you write a function, but don't tell that to the weak). Also, AOP is a thing that has a name in Java, which is pretty funny, given that in other languages it tends to be effortless function composition.

As an environment it turned out to be great, eventually. If you want another shinning example, take a look at Go. Same story.

In other words Clojure or Scala will never reach the popularity of something like Java. Sorry, people can wish for it, but it will never happen. And that's OK.

Sorry to jump in on an interesting discussion but I got interested in this part:

> It's also disheartening to see that the sorted-set wants things that implement Java's Comparable.

Could you quickly explain what's disheartening about Java's Comparable interface and what the available options are?

In a static language like Scala or Haskell you usually work with a type-class, which is pretty cool because you can provide implementations for types you don't control (not implying that Scala's Ordering is perfect). Instead of Java's Comparable interface I was expecting a protocol, which are almost equivalent to type-classes, or a multi-method.

Of course, in many implementations you usually also get a version of a constructor that uses a provided comparison function. However certain things, like plain numbers or strings, have a natural ordering to them, hence the need to have the sorted-set in Clojure also work with Java's Comparable. But again, I was expecting a Clojure specific protocol.

The reason for why this happens, I believe, is because protocols weren't there from the beginning, being a feature introduced in Clojure 1.2. From what I understood, in ClojureScript protocols are "at the bottom" as they say, so there's hope :-)

Maybe a mitigating factor is that any Clojure 2-arg function extends AFunction which implements Comparator. So any Clojure function that returns -1/0/1 will work transparently. Example, use - as a comparator:

=> (sorted-set-by - 2 9 3 5 4) #{2 3 4 5 9}

> given that Clojure is a dynamic language

Yes, that helps in this case, but that doesn't mean being statically compiled allows you to disregard extra-linguistic downsides.

> however it has many things that need to be cleaned out.

No! There are things that could have been better; sure. But they don't need to be cleaned out because that would break backwards compatibility. Backwards compatibility is a very, very, very important thing. So much more important than a "perfect implementation" (something that can never be achieved anyway). This is something that is very hard for PL purists to understand, but extra-linguistic features and guarantees trump 100% consistency almost every time.

> It's also disheartening to see that the sorted-set wants things that implement Java's Comparable.

That's only disheartening if you're a purist. If you care about extra-linguistic concerns, such as Java interoperation, this is a mark of great design. You see how the language designers took into account concerns outside the language itself. Every Clojure map is a Java map and vice versa. Every Clojure sequence is a Java collection and vice versa. Every Clojure sorted set is a Java sorted set and vice versa. This is great design around constraints and what Rich Hickey always talks about. The language itself is no more important than how it fits within the larger ecosystem and its constraints.

> as I can't see how it can be applied to reactive streams (e.g. Rx) when back-pressure is involved

Clojure is an imperative language first and functional second. It is even more imperative than Scala (in spite of Scala's mutability) -- at least with Scala's (recent-ish) emphasis on pure-functional concepts (a terrible idea, BTW, but that's a separate discussion). Back-pressure is therefore automatic (and implicit) with pull-based channels (that transducers can be applied to).

> I prefer languages that fix their mistakes

Because you're a purist. I prefer languages that see the big picture consider what's important in the grand scheme of things, and realize that code is important but secondary to working programs. As Rich Hickey is not a PL researcher, I am sure that those mistakes will only be fixed if they don't break compatibility or Java interoperability, which are more important for the industry than a perfectly clean language.

> not sure why we're having this conversation

Because supporting separate compilation -- if not perfectly then at least treating it as a top-priority concern -- is one of the key ways to ensure binary compatibility between runtime-library versions.

> On the other hand certain features, like traits providing method implementations or default parameters are landmines.

That's exactly my point. Separate compilation (as many other crucial extra-linguistic concerns) is just not a priority for Scala. Kotlin, OTOH, implements default arguments on the receiver end rather than at the call-site, so changing default values in the target doesn't necessitate recompiling the caller.

I guess I must have been dreaming when seperate compilation fell apart in Koltin recently due to backward-incompatible changes.
That's just the power of good upfront design.
This doesn't seem true. See http://comments.gmane.org/gmane.comp.java.clojure.user/85930

"Binary compatibility across releases is not guaranteed, but it is something we strive to maintain if possible. "

If you read what it says, they've changed an implementation of a function which may change ordering of unordered collections, and that may break some people's code if they relied on some specific ordering. They are not talking about breaking interfaces. In fact, they specifically say they are not aware of any binary incompatibilities. Their "strive to maintain" works out very well in practice, so far.
I wrote that sentence and you should take it as written. In general, we strive to maintain binary compatibility (old AOT-compiled Clojure code should continue to run on newer version), we do not exhaustively test for or guarantee that. At times we have broken this in alphas etc and we take it as a high priority to fix such things.