by MaxBarraclough 2925 days ago
I've heard theorists say exactly this a number of times before.

The reason Java isn't a 'pure' object-oriented language is simply performance.

Suppose everything - even every int - is heap-allocated. You now need a very sophisticated JIT compiler (as in, better than any we have today), or it's going to run dog slow. Having a huge number of needless allocations happening at every step is going to:

1. Slow things down by doing vastly more heap allocations than you otherwise would (even with Java's ultra-fast allocations)

2. Slow things down by doing violence to your code's locality and cache behaviour, because your ints no longer live on the stack

3. Slow things down by doing violence to your code's locality and cache behaviour, because Java objects are bloated compared to raw ints

4. Slow things down by hugely increasing garbage-collection pressure

If Gosling had taken that route, we wouldn't be talking about Java today.
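To make the allocation point concrete, here is a minimal Java sketch contrasting a primitive accumulator with a boxed one. The class and method names are hypothetical, chosen only for illustration, and whether a given JIT removes the boxing is not something this sketch measures.

```java
// Hypothetical demo class; the names are illustrative, not from any library.
final class BoxingDemo {
    // Primitive accumulator: the running total is a plain local variable
    // (a stack slot or register), so the loop creates no objects.
    static long sumPrimitive(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Boxed accumulator: each `total += i` unboxes, adds, and re-boxes.
    // Unless the JIT optimises it away, that can mean a fresh Long per iteration.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(1_000) + " " + sumBoxed(1_000));
    }
}
```

Both methods return the same value; the difference is only in how much work the allocator and garbage collector may have to do along the way.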

3 comments

> Suppose everything - even every int - is heap-allocated. You now need a very sophisticated JIT compiler (as in, better than any we have today), or it's going to run dog slow.

Most dynamic language implementations (JavaScript, Lisp, Smalltalk, even Ruby) use a tagged pointer representation allowing integers (and sometimes floats) to be encoded directly in the reference, avoiding heap allocation in this common case.
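A rough sketch of the tagging idea, written in Java purely for illustration. It assumes one particular encoding (low bit 1 means "small integer", low bit 0 means "aligned pointer"); real runtimes differ in the details, and all names here are hypothetical.

```java
// Hypothetical model of a tagged machine word, using a Java long as the "reference".
final class TaggedWord {
    // Assumed encoding: the integer sits in the upper 63 bits, the low bit is the tag.
    static long tagInt(long value) {
        return (value << 1) | 1L;
    }

    // A word with the low bit set is an immediate integer, not a pointer.
    static boolean isInt(long word) {
        return (word & 1L) == 1L;
    }

    // Arithmetic shift restores the original value, including its sign.
    static long untagInt(long word) {
        return word >> 1;
    }

    // Two tagged integers can be added without untagging:
    // (2a + 1) + (2b + 1) - 1 == 2(a + b) + 1.
    static long addTagged(long a, long b) {
        return a + b - 1;
    }

    public static void main(String[] args) {
        long w = addTagged(tagInt(3), tagInt(4));
        System.out.println(isInt(w) + " " + untagInt(w));
    }
}
```

The cost is one bit of integer range and a tag check on arithmetic; the benefit is that small integers never touch the heap.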

Another alternate model is to pass the type of a value separately from the value itself, and allow the value to be of variable size.

Java simply made the wrong tradeoff, and while it wasn't fully apparent at the time, there's no good defense of that decision today.

Most dynamic language implementations don't let me inline a bunch of 128-bit or 256-bit values in an array, or allocate them on the stack.

Code that deals with a lot of values which are small but larger than a machine word can be made a lot more efficient if there is a way to treat those values as not objects.
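As an illustration of what "not objects" buys you, here is a sketch of how one might hand-roll an array of 128-bit values in today's Java by flattening them into a single long[]. The class is hypothetical and this is only one possible layout.

```java
// Hypothetical flat container: element i occupies words[2*i] (high half)
// and words[2*i + 1] (low half), so there is no per-element object or header.
final class FlatU128Array {
    private final long[] words;

    FlatU128Array(int length) {
        words = new long[2 * length];
    }

    void set(int i, long hi, long lo) {
        words[2 * i] = hi;
        words[2 * i + 1] = lo;
    }

    long hi(int i) {
        return words[2 * i];
    }

    long lo(int i) {
        return words[2 * i + 1];
    }

    int length() {
        return words.length / 2;
    }

    public static void main(String[] args) {
        FlatU128Array a = new FlatU128Array(4);
        a.set(2, 0x1L, 0xFFFFL);
        System.out.println(Long.toHexString(a.hi(2)) + " " + Long.toHexString(a.lo(2)));
    }
}
```

The elements are contiguous in memory, at the price of giving up a proper type for each element, which is exactly the ergonomic gap that language-level value types would close.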

That is true, but the original poster was talking about heap allocation of simple integers, not larger values.

> tagged pointer representation allowing integers (and sometimes floats) to be encoded directly in the reference

I don't see how that could work for Java. Every Java object can be used as a mutex. This strikes me as very silly, but it's part of Java. Incidentally .Net went the same way, and I'm not the only person who thinks it was a silly decision for both frameworks [0]
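For anyone unfamiliar with the feature, a small sketch of what "every object is a mutex" means in practice. The class and method names are hypothetical; the only language feature relied on is the standard `synchronized` statement.

```java
// Hypothetical demo: any reference at all can serve as the lock.
final class MonitorDemo {
    static int incrementConcurrently(Object lock, int threads, int perThread) {
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // The monitor belongs to whatever object `lock` refers to.
                    synchronized (lock) {
                        counter[0]++;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        // A StringBuilder and an empty array work just as well as a dedicated lock object.
        System.out.println(incrementConcurrently(new StringBuilder(), 4, 1_000));
        System.out.println(incrementConcurrently(new int[0], 4, 1_000));
    }
}
```

A value encoded directly in a reference has no identity and no header in which to keep monitor state, which is the tension being described.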

Java also tags objects with type information for runtime checking, but that would play ok with tagged pointers as you're describing, as far as I can see.

Seems to me .Net generally has the right idea on types. Primitives are not objects, but you can do List<int> without autoboxing.

> Another alternate model is to pass the type of a value separately from the value itself, and allow the value to be of variable size.

Wouldn't that bloat the stack considerably? Wouldn't it be better to have a type-system that eliminates the need for that sort of thing?

> Java simply made the wrong tradeoff, and while it wasn't fully apparent at the time, there's no good defense of that decision today.

Are there any modern frameworks at all similar to Java, that do as you describe? Dog-slow dynamic languages aren't really the same beast.

[0] https://stackoverflow.com/a/282342/

Agreed, but they could have shown value types and AOT compilation a bit more love in the 1.0 days, given the ongoing research into GC-enabled languages for systems programming going all the way back to CLU and Mesa/Cedar.

Oh well, at least in a couple of years we will have them.

Agree. This is something .Net does well.

Languages do not need a 1 to 1 relationship between the storage medium of a value and the interface it exposes to a programmer.

Java's situation is even worse because it's a compiled language that does not need a JIT or even much sophistication from a compiler to keep a single hierarchy on its type system.

I suspect the reason Java did it was to not surprise C++ programmers. Solely dictated by marketing, not by technical reasons.

> Languages do not need a 1 to 1 relationship between the storage medium of a value and the interface it exposes to a programmer.

Sure, but that doesn't excuse the well-documented 'sufficiently-smart-compiler fallacy'.

The performance improvements that can be had from the escape-analysis/object-inlining family of JIT compiler optimisations are considerable, but even today, production JVMs don't do a very good job. It's not an easy problem to solve well.
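For context, this is the kind of code those optimisations target: a short-lived object that never leaves the method. The sketch below is hypothetical and makes no claim about whether any particular JVM actually eliminates the allocation.

```java
// Hypothetical example of a non-escaping temporary.
final class EscapeDemo {
    private static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static long sumOfSquaredLengths(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            // `p` is never stored in a field, passed out, or returned, so a JIT
            // *may* replace it with two ints. Nothing guarantees that it will.
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaredLengths(3));
    }
}
```

The result is the same either way; the optimisation only changes whether each iteration pays for a heap allocation.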

> I suspect the reason Java did it was to not surprise C++ programmers. Solely dictated by marketing, not by technical reasons.

I sincerely doubt it. You're wrong to dismiss the performance question.

> I suspect the reason Java did it was to not surprise C++ programmers. Solely dictated by marketing, not by technical reasons.

C++ is actually more uniform than Java in this respect because it allows one to define new value types, and also allows heap-allocation of "primitive" types such as int.

> Languages do not need a 1 to 1 relationship between the storage medium of a value and the interface it exposes to a programmer. ... I suspect the reason Java did it was to not surprise C++ programmers. Solely dictated by marketing, not by technical reasons.

The situation is almost the opposite of what you described. If anything, one of the major design points was to solve the surprises inherent to C/C++.

Take integers for example. In Java they're defined to be two's complement; the processor architecture doesn't matter. C/C++ left the spec open to let the language be 1 to 1 with the medium, Java did not.

You can see the proposals to close this headache: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p090... http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm
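The Java side of this is easy to check, because the language defines int as a 32-bit two's-complement value with wrapping arithmetic. A small sketch (the class and method names are hypothetical):

```java
// Hypothetical demo of Java's fixed two's-complement int semantics.
final class TwosComplementDemo {
    // Overflow wraps around rather than being undefined.
    static boolean wrapsAround() {
        return Integer.MAX_VALUE + 1 == Integer.MIN_VALUE;
    }

    // The all-ones bit pattern is -1.
    static boolean allOnesIsMinusOne() {
        return ~0 == -1 && 0xFFFFFFFF == -1;
    }

    // Negation is "invert the bits and add one", for every int.
    static boolean negationIsInvertPlusOne(int x) {
        return -x == ~x + 1;
    }

    public static void main(String[] args) {
        System.out.println(wrapsAround());
        System.out.println(allOnesIsMinusOne());
        System.out.println(negationIsInvertPlusOne(12345));
    }
}
```

These checks give the same answers on any conforming JVM, regardless of the underlying hardware.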

This is similar to all the issues with how threads interact with memory. Java led the way in making the memory model a part of the language specification rather than allowing it to be implementation (and architecture) defined.

http://www.theregister.co.uk/2011/06/11/herb_sutter_next_c_p...
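As one concrete consequence of having a specified memory model, a write to a volatile field happens-before a later read of that field that sees the written value, so earlier plain writes become visible too. A minimal sketch, with hypothetical names:

```java
// Hypothetical demo of safe publication through a volatile flag.
final class VolatilePublish {
    private int data;               // plain field
    private volatile boolean ready; // volatile flag

    void publish(int value) {
        data = value; // ordinary write...
        ready = true; // ...made visible by the volatile write that follows it
    }

    int awaitAndRead() {
        while (!ready) {
            Thread.yield(); // spin until the volatile read observes true
        }
        return data; // guaranteed to see the value written before `ready = true`
    }

    static int demo() {
        VolatilePublish box = new VolatilePublish();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        int seen = box.awaitAndRead();
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without `volatile` on the flag, the specification would permit the reader to spin forever or to observe a stale `data`, and that behaviour is spelled out by the language rather than left to the platform.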

> If anything one of the major design points was to solve the surprises inherent to C/C++.

Sure, but you're speaking past each other.

* Java was designed to be more predictable and have fewer dark corners than C++ (no undefined behaviour, precisely defined primitives and generally far less platform-specific behaviour, etc)

* Java was designed to feel familiar to C++ developers in order to aid adoption (specifically its syntax)

These aren't in contradiction.

> Take integers for example. In Java they're defined to be two's complement; the processor architecture doesn't matter. C/C++ left the spec open to let the language be 1 to 1 with the medium, Java did not.

There is a historical reason for this. In the early 1970s, when C was first designed, non-two's-complement machines (such as CDC and UNIVAC machines) were still an important part of the industry, so it made sense for C to be designed to allow supporting those machines. By the 1990s, when Java was designed, non-two's-complement machines were much less relevant, so it made sense to exclude support for them from the design of Java. Now finally, in the late 2010s, when the relevance of those machines has shrunk even further (although ones'-complement Unisys mainframes still exist even today), it makes sense to remove that support from the C standard, even though it made a lot of sense when C was first designed.

> I suspect the reason Java did it was to not surprise C++ programmers. Solely dictated by marketing, not by technical reason

Java's reference vs. value distinction is extremely surprising for C++ programmers.

> Languages do not need a 1 to 1 relationship between the storage medium of a value and the interface it exposes to a programmer.

Java isn't just a language but also a VM specification.

Sure. So what?

In both Java and Java bytecode, primitives are not objects.
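The split is visible from within the language itself via reflection. A small sketch (hypothetical class and method names) using only `Class.isPrimitive()` and `Object.getClass()`:

```java
// Hypothetical demo: int is a primitive type, Integer is an ordinary class.
final class PrimitiveVsObject {
    static boolean intIsPrimitive() {
        return int.class.isPrimitive();
    }

    static boolean integerIsPrimitive() {
        return Integer.class.isPrimitive();
    }

    // Assigning an int to Object forces boxing; the object you get is an Integer.
    static Class<?> runtimeClassOfBoxed(int v) {
        Object o = v;
        return o.getClass();
    }

    public static void main(String[] args) {
        System.out.println(intIsPrimitive());
        System.out.println(integerIsPrimitive());
        System.out.println(runtimeClassOfBoxed(5));
    }
}
```

At the bytecode level the same distinction shows up as separate instruction families, for example `iload`/`iadd` for ints versus `aload` for references.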