They also seemed to have avoided libraries like numpy.
In my mind, python made it possible to hardware optimize with libraries like numpy quite easily. Avoiding it is a mistake. I'll try to see if I have time to play the game myself and throw my attempt in there.
Everything that makes Python efficient is written in C or C++ or the like. Python in these situations is just a glue language (with an optional interactive layer) that makes using these libraries more feasible.
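A minimal sketch of the "glue language" point, using only the standard library (the two function names are made up for illustration). Both functions compute the same total, but the built-in `sum` runs its loop in C inside CPython, while the hand-written loop executes interpreted bytecode on every iteration. Timings depend on the machine and Python version, so none are quoted here.

```python
import timeit


def python_loop_sum(values):
    """Add the numbers one at a time in interpreted Python bytecode."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Delegate the loop to the built-in sum, which CPython implements in C."""
    return sum(values)


if __name__ == "__main__":
    data = list(range(100_000))
    # Same answer either way; only the place where the loop runs differs.
    assert python_loop_sum(data) == builtin_sum(data)
    for fn in (python_loop_sum, builtin_sum):
        seconds = timeit.timeit(lambda: fn(data), number=10)
        print(f"{fn.__name__}: {seconds:.4f}s for 10 runs")
```

Libraries like numpy apply the same idea to whole arrays: the Python code describes the operation, and compiled code does the iteration.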
Well, the Java source uses threads. Guess how that's implemented.
FWIW if python provides an abstraction that keeps the code readable while keeping the efficiencies of C, I think that should count for python, not against it.
As a Fortran user, it's pretty awful for string handling. It's fine for numerics. However, there's a lack of libraries for non-numeric stuff (e.g. data structures). The free compilers are buggy, too.
Are you sure you've looked inside all those .dlls and .sos you use every day?
At the end of the day, Fortran is compiled, and it can just produce dynamic libraries that the programs you use depend on, and you'd never know.
Heck, I wouldn't bet against me using at least 1 Cobol library once per month, you never know what kind of craziness is going on behind the scenes.
Banks have all kinds of random legacy crap written in all kinds of random languages. While COBOL is a lot more common, I guarantee you there are plenty of banks with bits of Fortran in their code bases. It is particularly found in older code for economic modelling, etc
Back in the old days, a lot of apps were written in Fortran that would never be written in that today. I used to work for a university where the application used to determine whether a student had met the graduation requirements for their degree was written in Fortran. Why Fortran? It was a manual process, then one of the professors offered to automate it for them, and he wrote the app in Fortran, because that was the language he was most comfortable with. And 30 years later they were still running it (although they finally replaced it with a COTS package when I worked there)
I once worked with an insurance company for whom a key business application was written in Turbo Pascal for DOS. They wrote it back in the 1980s when everyone had DOS machines. By the 2010s they were still running it in a VM. For all I know they are still doing that today
I bet you're right and that I'm misremembering. I learned about it in the context of fronting mainframes with graphql so I wasn't thinking too much about the backend language.
Another poster pointed out I was probably misremembering cobol. Either way, your example was google searches. You don't search millions of times a day so you have to be talking about backend functions? The banking system could easily hit a million function calls in a day related to you or data you care about.
One should always be skeptical of these sorts of benchmarks, especially against Java, and I say this as primarily a Java developer.
Well optimized Java is very lean and can be incredibly performant, but the median modern Java app is an obese cronenberg made out of gigabytes of SpringBoot dependencies. That's not particularly fast; and well optimized python will run circles around it.
This is an important point, real life performance looks nothing like benchmarks in most cases.
From my personal experience: benchmarks showed JRuby was MUCH MUCH faster than 'normal' CRuby. But try running a basic Rails application on JRuby and it's 10-100x slower, even after minutes of repeating the same request.
(disclaimer, this was a few years back and purely anecdotal)
I did an exercise some time ago and managed to get two file transfer apps, one in Rust and one in Java, where the performance was effectively the same (in some cases with a slight Java edge!), but the memory consumption was orders of magnitude in favor of Rust. The JVM is a wonder of engineering, and Java is indeed quite fast.
Most reasonable people in the industry understand that Java can be fast, especially in benchmark-oriented code. The problem is that the OOP-first ideology of the language and the coding culture surrounding it is not performance-oriented, especially with modern machines that don't like pointer-chasing.
One not very well known fact is that just-in-time compilers have more optimisation-specific info than pure ahead-of-time compilers, i.e. most used code paths, real types used for polymorphic values, etc. That is, until your AOT compiler uses something like a profile-guided optimisation.
In practice though, writing for performance in Java takes a very dedicated effort. Real Java programs are horrible with memory, okay (for an AOT language) in long-running compute-intensive tasks, and painfully slow at short-lived tasks such as CLI utils.
Almost any real program written in C would run circles around most off-the-shelf Java programs.
Turns out, though, all of that doesn't really matter :-)
What I noticed is that you have a lot of tools available to tune Java applications, but it was more necessary to rely on them to get to a good spot. In the Rust version, I got a 2x speed improvement by using a rayon thread pool instead of tokio, but the improvements in the Java version were 100x by the end of it.
But nothing beats VTune for squeezing the lemon of a tight loop of C code beyond any reason :-) Java just doesn't lend itself well to that kind of low-level code wrangling. One misstep - and the performance house of cards falls apart.
I don’t think there is much point to these discussions without at least fixing some parameters/problem domain across languages. Of course pointer chasing is worse than a well-thought out encoding that fits many objects into a cache-line and is sequentially accessed. The question is — can the problem at hand use such a structure in the first place? Because not every problem can be solved with ECSs and the like - hell, I would even go as far to claim that most programs have a small hot loop, and otherwise spend their time slowly crawling through vast amounts of code and data in a non-orderly fashion. What part is prone to nice, sequential access is traditionally handled by the DB, which does its work very well. For the resulting 10 lines that has to be iterated through, it doesn’t really matter if it’s pointer chasing, especially when it has to wait for IO at every step in a typical server application.
So if we actually compare similar problems at hand, for a certain kind the difference is negligible, and is well offset by safety, productivity, maintainability. Hell, due to having safer primitives, one might be able to take advantage of better parallelism, that easily makes up for the slower single-core performance.
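To make the layout distinction above concrete, here is a rough sketch in Python (the `Particle` class and both helper names are hypothetical). A list of objects holds references that must be followed one by one, while `array('d')` stores the raw C doubles side by side in a single buffer. It only illustrates the two shapes of data; it is not a benchmark.

```python
from array import array


class Particle:
    """One heap object per element; a list of these stores only references."""

    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


def sum_objects(particles):
    # Each step follows a reference to a separate object, then to its float.
    return sum(p.x for p in particles)


def sum_packed(xs):
    # The values sit contiguously in one buffer and are read in order.
    return sum(xs)


particles = [Particle(float(i)) for i in range(1000)]
packed = array("d", (float(i) for i in range(1000)))
```

Whether the packed form is even an option depends on the problem, which is the point being argued: it works when the data is uniform and accessed sequentially.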
I agree with most things you say. This is why most DBs are written in C relatives or descendants: C++, Rust, or just good old C. And Java dominates in the server-side business logic domain.
But boy it's cumbersome in anything other than those long running jit-friendly server-side processes..!
PS for IO-heavy purposes almost any mem-managed language would do. The real code running there is OS figuring out hardware and physical reality.
They both boiled down to a sendfile call, effectively. The differences came to runtime weight, and parallelism strategy/implementation. It turns out that not having to pay for object headers/stack allocation by default, helps a lot more than I anticipated. I did this to actually measure what the difference was.
No man spring boot is hideously slow as well compared to vanilla java. It's pretty fast compared to many other things, but that's just because vanilla java is extremely fast.
You often see like 15-30 second start times for springboot apps, which is hilarious and 13-28 seconds longer than it has any business taking.
Any java performance test is generally done against a warmed up VM - which would be representative of the 99.99% of the time the app is running. With or without spring boot.
I've tried and worked with most of what the Spring ecosystem has on offer, and I'm not particularly impressed. It seems to codify all the practices that makes for mediocre software.
> Micronaut is a software framework for the Java virtual machine platform. It is designed to avoid reflection, thus reducing memory consumption and improving start times. Features which would typically be implemented at run-time are instead pre-computed at compile time.
Actually I ported it to Micronaut too and compiled it to native code. It's difficult to know exactly what the minimum amount of memory that a java application could run with during normal load, but I'd estimate that it was around 150 mb instead of 200 which was disappointing, 128 mb got intermittent OOMs. But the code was nicer than spring boot (or rust).
You should get your hands on say, go and go-fiber, and compare that with spring boot regarding speed and memory. And I say that with our main backend being kotlin/java and spring boot. It's a nightmare and absolute and utter garbage.
Amazing. I have seen this list popping up in my LinkedIn feed for more than a year already and it keeps coming back.
While any benchmark is of course interesting when considered on its own, the conclusions that people draw from this list tend to be complete nonsense: "Python is bad for the environment", "we should switch to language xxx to combat global warming" etc.
First of all, especially for the slower languages in this list, it is extremely rare that application code written in that language is the bottleneck of a performance critical application. Typically the bottlenecks of every day applications tend to be databases and network. If you need heavy number crunching, there tend to be excellent libraries available written in lower level languages, such as e.g. for Python NumPy and Pandas.
If you are really concerned about the energy footprint of your application, you're probably better off optimizing these components. Or optimizing the higher level architecture of your own application, which tends to be a lot easier in a higher level language.
Second, coding in a faster language is often not economically possible. Programming the same application in C will normally take much longer than coding it in Python. Developing time costs money and can also mean loss of opportunity.
And finally, the 'greenest' language of them all was not even included in this list: assembly. I guess the author must have realized in the back of his head that such a difficult language wasn't a realistic option for switching to a more energy efficient solution. He/she just failed to realize that this argument applies to many other languages as well.
> "we should switch to language xxx to combat global warming" etc.
These takes fail to take into account the huge discrepancy between the market value produced by computer systems per unit of energy. It is such a huge value that it barely matters compared to most other industries, plus the net amounts are not particularly high either. Nonetheless, it can matter in certain cases, and Java is a great choice for server setups due to its GC being very efficient in doing only the necessary amount of work.
Also, I’m not convinced that a full on complex assembly app would be leaner than the same program written in a lower level language. That’s an area where compilers are very good, and humans only have so much working memory/hair on their head to hand-optimize whole programs.
I would add that proof of work algorithms (Bitcoin type) where the goal is to endlessly compute hashes with no real computational productivity gain are the real offenders, regardless of which language they're written in.
Most of the implementations for LLMs runtimes I know of are Python using PyTorch. I believe most of the heavy lifting is written in C++ (llama.cpp), though.
And given that a major limitation of bitcoin-hashing is the cost of electricity, I would guess that they're written in a very energy-efficient language already.
I think you bring up a good point. Our economic system isn't designed to reward low energy consumption. In fact, GDP and energy usage are very tightly linked. And because we are seeking growth in GDP at a system level, we are doomed to use more energy.
Strange, and yet we keep using more energy every year. Fun fact, we burn as much wood for heating as we did 100 years ago. What ends up happening is we consume all available energy. Economic growth is limited by energy so the whole system is incentivized to GROW energy usage, not reduce it.
I suspect you are talking about per capita energy usage. Per capita energy usage has stayed around 80 MWh per person since 1965 in the US.
But what's important isn't the per-capita usage, but aggregate usage. Because guess what, the climate doesn't care about per-capita CO2.
You have to look at it from a system level, not individual actor level. Our whole political and economic system is designed to have exponential GDP growth. And because energy is a limiting factor in GDP growth, then our whole political and economic system is designed to extract as much energy as possible to keep growth from hitting that limit.
>Fun fact, we burn as much wood for heating as we did 100 years ago. What ends up happening is we consume all available energy. Economic growth is limited by energy so the whole system is incentivized to GROW energy usage, not reduce it.
That's because the population has increased but poverty hasn't substantially decreased, and the poorest people on the planet tend to burn wood for heating instead of using electricity.
Energy consumption is a result of labor applied to tools and technology for production. It is not mere correlation, but it is also not the cause, it is one measurable downstream effect.
China, in fact, used energy consumption as the measure for GDP reporting. For this reason, local leadership in cities and provinces would game the system by running the coal plants at peak and turning on all lights in a city day and night to juice the numbers so they'd show higher GDP (Beijing used to do this before they hosted the Olympics, for example).
Available energy is a limiting factor in economic growth. More available energy allows for more economic growth. While a decline in economic growth for other reasons (financial crisis, pandemic) reduces demand for energy. So it's both a cause and effect. It's a feedback loop! Asking if it's causation or correlation is the wrong question. Does the temperature in the room affect the thermostat, or does the thermostat affect the temperature in the room? It's a feedback loop!
In other words, when potential economic growth butts up against energy supply, energy prices rise and they rise non-linearly. This causes a reduction in potential economic growth. If there is a decline in economic growth, then demand falls below supply and prices drop. As prices drop they enable more economic growth.
In other words, the economy grows as much as energy supply allows. A 10% decline in available energy means a 10% decline in GDP. It really is that linear.
Sometimes I wonder if that's true long term. The amount of bugs, security issues, and code churn surrounding C code bases is pretty wild. If everybody and all their deployments has to recompile a C code base 100 times over the course of three years to get patches, fixes, etc it makes me wonder what the real cost of C is?
My intuition states that this isn't the case, but intuition can be wrong.
Just write in Rust, it's easier to build good software than either of these languages (yea, yea, do your data analyst stuff in python until you start using POLARS). At this point it's a far superior ecosystem from a developer experience point of view and the fact it's going to get you as close to efficient as possible without you thinking too far into it is a welcome side effect.
Using VScode with the Sqlx package I get compile time errors on my SQL - as a result I haven't run a binary with a typo from my SQL in the last 12 months.
Easy? No. Rust is a far more difficult language to work in than Python.
It will require you to write more correct code, so the end product is more likely to come out better. But it is not easy, even for building "good" software. (Assumption: "good" := correct, maintainable, efficient, and robust -- Rust nails the efficient part.)
Are there any good interactive programming options in Rust? That's the main draw of a dynamic language like Python for ad hoc data science tasks, the semantics just lines up well with the highly iterative nature of the work.
I know I saw earlier in the year a Jupyter notebook for rust - but I haven't used it personally so I'm not sure how nicely it will play. I use jupyter+python to view and play with the data usually and then write my actual job in rust with Polars to use as the ETL or whatever i'm doing with it at that moment.
Sincerely doubt it does because that's a difficult thing to measure. If you are writing a lambda function at a FAANG company you do way more than break even with a language that has a heavy compilation time. If you're fixing a comment for a hobby project that doesn't ship binaries that you and your nephew use, definitely not.
The deployment and consumption of said code matters a lot for that.
This is why I'm learning Rust: C# gives faster development speed than Python with decent performance. Python and Go are for when their native libraries smoke C#, and now Rust for performance or low-level work.
Smoke as in have better support for specific (niche) use cases. Python has strong numeric libraries and Go has better support for modern/interesting/crazy cryptography systems than C#/Java.
So in reality there are very few places where you would want to use Go over modern C#.
While Java's compiler or JIT is probably one of the most optimized pieces of software in existence, how do we measure what overhead the language nudges programmers to build? I am thinking of AbstractProviderFactoryProxy and similar. Or that for a very long time it forced you (or still does to a degree) to put everything into classes, breeding 1 or 2 generations of programmers, who see a noun and jump to making a class and then subsequently initializing the class to be able to call a method, which actually would only need to be a standalone function.
It probably does not balance the 38x, and probably the JIT optimizes parts of it away, but surely the cost is far from zero.
Also what about the computational resources it takes to just run a JVM, compared to running a Python program?
Please explain how "AbstractProviderFactoryProxy" is related to the language.
I have taken over as lead for a Python project and what I see is all the Java problems people used to make fun of (none have worked in Java before) and worse. On top, everything has an interface starting with "I", C# style. Methods in entities, service classes are instantiated with state etc. And side-effects are all over the place.
If you are writing modern Java syntax, there is little that is more stable and readable than good modern Java syntax.
> Please explain how "AbstractProviderFactoryProxy" is related to the language.
The language Java did, until some time ago, not allow you to pass functions as arguments. You had to make an anonymous inner class for example, satisfying some interface. Many design patterns are very much oriented towards a language like Java, which does/did not allow for higher order functions.
Take the visitor pattern for example. It will get you a "Visitor" in your class name implementing a visitor. How would it work in other languages, which provide higher order functions or always did so? Well, you would simply write a function and depending on whether your language has static types, annotate its types for arguments and return value. You would then pass this function in as an argument, whose name can be "visitor".
Take a factory pattern as an example. How would it work? Well, a factory can be expressed using a function, that returns another function. It takes the arguments, that specify/narrow down how the returned function works. In Java it would become a Factory class instead. Instead of using a simple function, one needs (needed?) to build a whole class around that, because one could not return a function. So one would build that class and then make it return an object instead, which of course again must be derived from some class ...
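A small Python sketch of the factory comparison above (all names here, `make_scaler`, `Scaler`, `ScalerFactory`, are invented for illustration). The first form is a function that returns a configured function; the second is the class-based shape you end up with when functions cannot be returned directly.

```python
from typing import Callable


def make_scaler(factor: float) -> Callable[[float], float]:
    """Factory as a function: returns a new function configured by `factor`."""

    def scale(value: float) -> float:
        return value * factor

    return scale


class Scaler:
    """Object that carries the configuration the closure above captures."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def apply(self, value: float) -> float:
        return value * self.factor


class ScalerFactory:
    """Class-based factory: builds Scaler objects instead of returning functions."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def create(self) -> Scaler:
        return Scaler(self.factor)


double = make_scaler(2.0)
double_obj = ScalerFactory(2.0).create()
```

Here `double(21.0)` and `double_obj.apply(21.0)` both give `42.0`; the difference is only in how much scaffolding surrounds the one line that does the work.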
Just 2 examples that quickly come to mind.
> I have taken over as lead for a Python project and what I see all the Java problems people used to make fun of (none have worked in Java before) and worse.
It is certainly true, that people also write shit code in Python. I have seen very popular libraries, which wrap REST APIs in objects. It is silly, because objects are meant to have some lifetime in which they communicate with each other and possibly change their state. There is no changing state though. Everything is a response from the REST API, which by the definition of REST should transfer the representative state in its responses. I don't want any in between stored state. I want the state that the API gives me. It is very much a more functional view on things. But Python programmers will go "make classes!" nevertheless. Because noun. Because not thinking about whether they really need a class. Because thinking that one approach fits it all and the only approaches they learned were procedural at the beginning of their learning and then OOP, which is complex enough to take years to learn properly, so that is where they stopped.
However, Python always (for a very long time?) has allowed you to pass procedures as arguments and as such does not encourage "AbstractProviderFactoryProxy" as much as Java does. Still, sometimes former Java developers try their hand at some Python code ...
> If you are writing modern Java syntax, there is little that is more stable and readable than good modern Java syntax.
Modern Java certainly has improved. But there is a lot of relearning to be done for those generations of Java programmers, to get rid of the "every noun a class" mentality.
You are misunderstanding the Visitor design pattern. It is not replaced by higher order functions — the point of it is to emulate multiple dispatch (over the usual single dispatch most languages have), that is, to change method implementation based not only on the receiver, but the argument as well.
(It is replaced/has an exact analog with pattern matching though. Thankfully, Java has that available nowadays as well).
> Represent[ing] an operation to be performed on elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.
And that is exactly what you can do with higher order functions (or maybe procedures, if your visitor involves a side-effect). Instead of passing a "Visitor" object/instance, you pass in a "visitor" function/procedure. It too can achieve the goal of enabling to implement "a new operation" outside of "the class" (or whatever one uses instead of a class). "Operation" probably also being conceptually closer to function than object, since an operation is about "doing something", while an object is not necessarily actively doing something other than live.
By writing a function that accepts a function as an argument, you leave open the possibility for some other part of the code to define that function (a new operation). If you need to add another operation, you just define a new function. That you can pass in as an argument.
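A minimal sketch of that argument in Python, assuming a toy tree encoded as `(value, children)` tuples (the encoding and all function names are made up for this example). `walk` knows how to traverse; each new operation is just another function passed in.

```python
def walk(node, visit):
    """Call `visit` on every value in a (value, children) tree, parents first."""
    value, children = node
    visit(value)
    for child in children:
        walk(child, visit)


def collect_values(tree):
    """Operation 1: gather every value into a list."""
    seen = []
    walk(tree, seen.append)
    return seen


def sum_values(tree):
    """Operation 2: add up every value, using a closure as the visitor."""
    total = 0

    def add(value):
        nonlocal total
        total += value

    walk(tree, add)
    return total


tree = (1, [(2, []), (3, [(4, [])])])
```

With this tree, `collect_values(tree)` returns `[1, 2, 3, 4]` and `sum_values(tree)` returns `10`. Neither operation required touching `walk` or the tree structure. This covers a single kind of node; it does not by itself dispatch on node type.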
How is a class that holds a static method/function any different to simply namespacing? Never understood people’s problem with that. For example, what’s the problem with Java’s Math “class”? It only has stateless math functions like any other language, nothing is forced on you.
It does not have to be a problem in terms of it working or not.
The problem is rather, that a class comes with loads of conceptual baggage. It is conceptually not fit for the purpose of merely namespacing.
A good language will have a concept fit for the purpose of namespacing. Either something directly called "namespace" or something that is meant to namespace things, like modules. A class is rather for grouping methods that interact with the state of the objects.
Of course, if you don't have anything else available, you gotta take what you have. It will not serve making the code more readable though, as a potentially new reader of the code will expect a class to be instanced somewhere, as is typical for classes. It can lead to confusion. If there was however a concept "namespace", they will immediately know that it is for grouping a category of things.
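For illustration, the same contrast in Python, where both options exist (the `MathUtil` class is hypothetical; `math` is the standard-library module). The class works fine as a bag of functions, but nothing tells the reader it is not meant to be instantiated, whereas a module can only ever be a namespace.

```python
import math


class MathUtil:
    """A class used purely as a namespace for stateless functions."""

    @staticmethod
    def hypot(a, b):
        return (a * a + b * b) ** 0.5


# The class can still be instantiated, even though the instance holds no state.
pointless_instance = MathUtil()

# A module groups functions too, but it is not callable, so there is
# no instance for a reader to go looking for.
class_is_instantiable = callable(MathUtil)
module_is_instantiable = callable(math)
```

Both `MathUtil.hypot(3, 4)` and `math.hypot(3, 4)` return `5.0`; the difference is in what the construct signals to the reader.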
True partly because shared_ptr is a relatively new language feature; I’ve been locked to a ~decade-old C++ compiler for most of my professional career, and so has everyone I know, just because support for a current compiler across platforms at any given time has been historically so spotty. It seems to be improving at the moment.
The C++ doesn’t use shared_ptr or STL or anything, and isn’t that far from the C code, but it isn’t that close either. I wonder if it’s OMP configuration that’s causing it to be slower, or maybe just instruction cache or something since the C++ is larger.
The results say this one has C++ going 50% slower. I’m a little skeptical it’s the compiler’s or the language’s fault. I’d speculate it might have more to do with how the program is written.
Yes, the link grabbing seo claims the benchmarks game has something to say about Java and energy efficiency; in fact the benchmarks game does not measure energy use.
No, the link grabbing seo provides the correct source for Table 4 — "Source: Energy Efficiency across Programming Languages, SLE’17" — and google finds the article:
JS has a JIT compiler, so for long running programs it can very well execute native machine code in the exact same way as C (plus a GC occasionally meddling with things). Also, it’s not even a bad JIT compiler, huge amount of development went into making it good.
Standard Python is interpreted all the time, going from instruction to instruction giving a layer of abstraction that never disappears.
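You can look at that instruction layer directly with the standard-library `dis` module. A small sketch (the `add` function is just a placeholder); the exact opcode names differ between CPython versions, so no output is shown.

```python
import dis


def add(a, b):
    return a + b


# Each entry is one bytecode instruction the CPython interpreter loop executes.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

if __name__ == "__main__":
    for name in opnames:
        print(name)
```

Even a one-line function becomes several instructions (load the arguments, perform the binary operation, return), and CPython executes each of them through the interpreter loop on every call.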
EDIT: fwiw, classic Fortran is twice as fast as java :P As is Rust, but that's less funny to me