What this says to me is that Java's reputation for sluggishness is a result of its idiomatic coding styles more than the language itself, in particular the casual way people allocate and discard objects for everything they do.
If you program it like it was C then you can get good performance, which makes sense given that the language was built for embedded devices with anemic processors. Of course you undoubtedly give up some maintainability when you do that, but the tradeoff is getting 6 million packets through the thing per second.
This would also explain why Java benchmarks so well but still tends to be slow in the real world.
Java's reputation for sluggishness was really formed based on people's experience with Applets and Swing applications and especially badly written Applets and Swing applications.
Java in the real world is fast. That's why it's used as the backbone of so many large organizations and so many scale out solutions (Cassandra, Kafka, Hadoop, etc) are written in Java.
Yes, I'd take this in conjunction with https://news.ycombinator.com/item?id=17824575 : "language speed" and "UI responsiveness" are only loosely related, and there are so many ways to end up wasting time blocking or locking without really realising it.
> Java's reputation for sluggishness was really formed based on people's experience with Applets and Swing applications
... and contemporary interactive performance.
python < /dev/null 0.01s user 0.01s system 96% cpu 0.025 total
racket < /dev/null 0.17s user 0.02s system 98% cpu 0.198 total
java 0.09s user 0.02s system 94% cpu 0.118 total
clojure < /dev/null 2.07s user 0.06s system 170% cpu 1.247 total
There's of course a theoretical point about being shoehorned into the UNIX execution model, and if Java were able to run as a persistent OS (eg nailgun or whatever it is these days) then things get much better. But still, when you start off a project having to suffer and design workarounds for the language's and implementation's flaws...
But in Java it always seemed that you are very much punished for having data structures that are at cross purposes to your dominant work load. I cut my teeth on implementing the last two parts of “make it work, make it right, make it fast”, often on projects where the existing team had declared that everything that could be done already had been done. There are a lot of refactorings that accomplish both goals, and I often got a 2-3x out of these projects by removing slow tech debt, and more by exposing a real info architecture.
It always surprised me that a language that so punished (especially in the early days where it was interpreted and all object lookups were double indirect) the Big Ball of Mud antipattern exhibited so many examples of it, so frequently.
On one of the projects I had trouble convincing the company to stay on Java for their application when they were displeased with the performance.
The previous developers were just careless/clueless about performance and when it started becoming a problem they cited good patterns they were following and blamed Java for their problems. The tech lead wanted to rewrite it in C++ which would be suicide IMO.
What I did was draw a horizontal bar chart that color-coded the different parts of transaction processing: business logic, frameworks, infrastructure, communication, etc. I then made a second chart marking which of those parts were unnecessary, with notes on how each could be optimized. As you might guess, parts that could not easily be optimized away were very hard to find.
In the end we stayed on Java, achieving almost two orders of magnitude performance improvement.
Just adding on that this is my experience as well.
Objects usually fell into one of two categories: discarded immediately or held for the entire application lifetime. Anything in between was problematic. Most things ended up using object pools. We also used to never really convert anything from binary format and just used wrapper objects to access the byte arrays directly.
Another trick was scheduling GC for times when the application was OK to pause. That helped considerably, and made behavior more predictable as well.
It'd be tough to compare it to C/C++ given the complexity of the application. But without giving away specifics, we had solid performance afaik. But you're correct that it does end up making for some interesting Java code.
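For anyone who hasn't seen the wrapper-object trick described above, here is a minimal sketch of the idea, not the code from that application. The record layout (a 4-byte sequence number followed by a 2-byte length in an 8-byte record) and the names PacketView and sumSequences are made up for illustration. The point is that one reusable view is re-pointed at each record, so nothing is decoded into fresh objects and nothing is allocated inside the loop.

```java
import java.nio.ByteBuffer;

/** Hypothetical record layout: 4-byte sequence number, 2-byte length, 2 bytes unused. */
final class PacketView {
    private ByteBuffer buf;
    private int offset;

    /** Re-points this view at another record; no allocation happens here. */
    PacketView wrap(ByteBuffer buf, int offset) {
        this.buf = buf;
        this.offset = offset;
        return this;
    }

    int sequence() { return buf.getInt(offset); }

    int length() { return buf.getShort(offset + 4) & 0xFFFF; } // read as unsigned
}

final class PacketViewDemo {
    /** Sums the sequence numbers of `count` records using a single reusable view. */
    static long sumSequences(ByteBuffer buf, int recordSize, int count) {
        PacketView view = new PacketView(); // the only object allocated
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += view.wrap(buf, i * recordSize).sequence();
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(3 * 8);
        for (int i = 0; i < 3; i++) {
            buf.putInt(i * 8, i + 1);            // sequence numbers 1, 2, 3
            buf.putShort(i * 8 + 4, (short) 2);  // payload length
        }
        System.out.println(sumSequences(buf, 8, 3)); // prints 6
    }
}
```

The cost is the one already mentioned: the view is mutable and only valid until the next wrap call, which is exactly the kind of "interesting Java code" you end up with.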
C was built for a minicomputer which might or might not be the brains of a phone switch, a software development environment, a word processing cluster for a newspaper, industrial control for a steel mill, etc.
The thing about Java is that it has promised and delivered a much better threading experience than other programming languages. When somebody proved the Java memory model was unsafe, Sun fixed it. C and other languages have adopted essentially this memory model but a decade and a half later. Java provides bulletproof tools in the form of Executors, Latches and other specialized concurrency constructs. It takes time to learn to use them, but you can ship fast and correct code for something like the LMAX Disruptor.
Actually, I think that talking about Java's multithreading here is a red herring; note that LMAX had to abandon multi-threading because it was destroying performance.
And this is important, because they explicitly made a very interesting point of current architectures being at odds with current thinking about best practices for concurrent programming.
The other thing is that the JVM was sluggish when Java was at peak hype. The steady performance improvements in the decades since have made a big difference.
It's not, but this is a thing people have said since the early days of Java which is still going around the internet. I personally use Java every day and have yet to hear someone complain about the speed of our backends.
>It's not, but this is a thing people have said since the early days of Java which is still going around the internet. I personally use Java every day and have yet to hear someone complain about the speed of our backends.
That's because for backends it doesn't matter. The DB layer or network will hide any slowdown anyway.
But for Desktop apps the latency due to GUI overhead and GC pauses can be from mildly annoying to unbearable.
Every time I fire up an Apache Tomcat and it burns several gigabytes of memory to somehow run a simple web service very slowly...
A more famous example might be Minecraft, where even with its blocky graphics it can tax a high end gaming machine when you turn the view distance up to a range that almost no other engine would consider long. The engine has been rewritten in other languages where it is much faster, notably the Microsoft version and the Phone version.
Isn't the memory use itself a big problem on modern architectures? Or has it gotten better lately since CPU clocks have been relatively flat and memory clocks have been creeping up? The problem with the "allocate-and-discard" model of programming in the past is that it thrashes the hell out of the cache, which means lots of trips to main memory, which is slow on modern machines.
When you get an article like this going "holy shit, where have you been all my life circular buffers" it emphasizes the point. You wanna go fast you have to avoid invalidating cache as much as possible.
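To make the circular buffer point concrete, here is a rough single-threaded sketch, assuming a fixed power-of-two capacity and primitive long payloads. It is not the LMAX Disruptor and has none of its concurrency machinery; the class name LongRing and its methods are invented for the example. All storage is allocated once up front and then reused, so the hot path never allocates and keeps touching the same small array.

```java
/** Fixed-capacity ring of primitive longs; single-threaded illustration only. */
final class LongRing {
    private final long[] slots;
    private final int mask;
    private long head; // index of the next value to read
    private long tail; // index of the next slot to write

    LongRing(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo <= 0 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacityPowerOfTwo]; // the only allocation
        mask = capacityPowerOfTwo - 1;
    }

    /** Stores a value, or returns false if the ring is full. */
    boolean offer(long value) {
        if (tail - head == slots.length) {
            return false;
        }
        slots[(int) (tail & mask)] = value; // wrap around by masking
        tail++;
        return true;
    }

    /** Returns the oldest value; throws if the ring is empty. */
    long poll() {
        if (head == tail) {
            throw new IllegalStateException("ring is empty");
        }
        return slots[(int) (head++ & mask)];
    }

    int size() {
        return (int) (tail - head);
    }
}
```

The power-of-two capacity is a design choice that lets the wrap-around be a bit mask instead of a modulo.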
It doesn't sound like you understand how memory management in Java works. In Java, you have a defined heap size. Java will claim memory in order to support that heap. If you are unhappy about how much memory it is using, you can change the heap size.
Or use Shenandoah GC ;) and it will uncommit heap not in use. Or wait for http://openjdk.java.net/jeps/8204089 to land.
In any case vastly improving idle RSS consumption.
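On the heap size point above: the maximum is set with the standard -Xmx launcher option, and the running program can check what it actually got. A small sketch, where the class name HeapReport and the helper maxHeapMiB are made up for the example:

```java
/** Prints the heap limits the JVM is actually running with. */
final class HeapReport {
    /** Maximum heap the JVM will try to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap MiB:  " + maxHeapMiB());
        System.out.println("committed MiB: " + rt.totalMemory() / (1024 * 1024));
        System.out.println("free MiB:      " + rt.freeMemory() / (1024 * 1024));
    }
}
```

Running it as `java -Xmx512m HeapReport` should report a maximum of roughly 512 MiB, though the exact figure can differ a little by JVM and collector. The committed number is the part that shows up in RSS.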
Memory usage is still terrible with Java but at least there are plans to address part of it with value types etc. But for a web service performance is pretty good (just look at techempower benchmarks and the fact that most high scale companies use Java for a significant part of their infrastructure). If startup time is your concern, you either don't need scalability or really want a different solution/architecture.
"If you program it like it was C then you can get good performance" -> Note that one of the points of the post is that just using "like C" won't magically fix your performance!
If anything, you could say "If you program it like a system with limited memory then you can get good performance". That, I can imagine being a universally applicable thing.
My point is that, for example for the particular case of memory management, there is an associated cost, whatever your language is; and you need to deal with it. Ignorance of the law is no excuse, etc.
The lack of value semantics for composite types is a huge problem in the language itself. It is often hard to get a Java program to have the same memory layout you would get writing idiomatic C/C++.
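A common workaround, sketched here under my own assumptions, is to give up on an array of small objects and store each field in its own primitive array. An array of point objects in Java is an array of references to separately allocated objects, whereas the layout below keeps the coordinates contiguous, closer to what an array of structs gives you in C. The class PointArray and its methods are hypothetical names for the example.

```java
/** Hypothetical 2D points stored as two parallel primitive arrays. */
final class PointArray {
    private final double[] xs;
    private final double[] ys;

    PointArray(int n) {
        xs = new double[n]; // contiguous storage, no per-point object
        ys = new double[n];
    }

    void set(int i, double x, double y) {
        xs[i] = x;
        ys[i] = y;
    }

    /** Sums the distance of every point from the origin in one linear pass. */
    double sumOfNorms() {
        double sum = 0;
        for (int i = 0; i < xs.length; i++) {
            sum += Math.hypot(xs[i], ys[i]);
        }
        return sum;
    }
}
```

It works, but you lose the ability to pass a single point around as a value, which is the language-level problem being described.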
This is the kind of thing I really love reading. It’s ostensibly about a technical subject, but that’s really a jumping-off point for examining how we think and talk about things, how portions of our field become Balkanized, how what we encounter shapes our sense of what’s possible and ultimately hardens into “common knowledge”.
When it was discovered by the Mathematics community it was met with bemusement. Sorry, I'm recalling from memory here from years ago when I was a graduate student in Mathematics and don't have ready posts.