by physguy1123 2882 days ago
In my experience, writing Java (or Groovy here) in C++ results in horribly slow code which the JVM runs circles around, and it sounds like that's the problem your employee ran into.

> But for all the applications where the high performance code is in niches at the edge and there simply aren't resources or expertise to fully tune the native implementation

It's interesting you say this, because in my experience it's the JVM which requires absurd amounts of tuning, while native programs are much more consistent. The proper way to write native programs, which is also the easier way, lends itself to fairly respectable performance, mostly because the object and stack model of, say, C or C++ is so much friendlier to the CPU than that of most dynamic languages.

In general, for all that I hear statements along this line, I've only twice seen code to back it up, and both times the C was so de-optimized relative to the OCaml version that I suspect it was intentional. The author (the same person in both cases) was a consultant for functional languages. In one case the C inner loop was switched to use an indirect call on every iteration, and in the other the C and functional versions being compared used different hash functions.


In addition, a lot of the techniques used to write high-performance Java boil down to "write it like C". Avoid interfaces, avoid polymorphic virtual calls (as you can't avoid virtuals entirely), avoid complex object graphs, avoid allocating as much as possible... It's not nearly as nice as naive Java. Still nicer than C IMO. If your process segfaults you can know for certain that it's a platform bug.
The other thing that makes Java nicer than C is the ease and depth with which you can profile it to discover where the bottlenecks actually are. While it's certainly possible to profile in both cases, the runtime reflective and instrumentation capabilities of the JVM really add a lot of power to it.
There's this classic paper from Google that runs an optimisation competition on the same program written in C++, Java, Scala and Go:

https://days2011.scala-lang.org/sites/days2011/files/ws3-1-H...

This is a great benchmark of the fundamental problems with, say, Java: the code itself is fairly simple and the JITs probably generate optimal code given their constraints, but the results clearly show that the GC and pointer chasing really hinder performance.

If you add in cases where SIMD, software prefetching, or memory access batching help, the difference will only grow.

It’s not native vs VM, but rather “has stack semantics/value types” vs “no stack semantics/value types”. In particular, OCaml’s standard implementation is native, not a VM.

Also worth calling out Go, which is rather unique in that it has stack semantics but it also has a garbage collector, so it’s kind of the best of both worlds in terms of ease of writing correct, performant code.
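
A minimal sketch of the value-type distinction in Go, using a hypothetical Point type that is not from the thread. A slice of Point values stores its elements inline in one contiguous block, while a slice of *Point stores pointers to separately allocated objects, which is roughly the layout a list of objects has in a language without value types.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Point is a hypothetical value type used only for illustration.
type Point struct {
	X, Y int64
}

// sumValues walks a contiguous slice of Point values.
func sumValues(ps []Point) int64 {
	var total int64
	for i := range ps {
		total += ps[i].X + ps[i].Y
	}
	return total
}

// sumPointers walks a slice of pointers; every element is a separate
// object that has to be dereferenced (pointer chasing).
func sumPointers(ps []*Point) int64 {
	var total int64
	for _, p := range ps {
		total += p.X + p.Y
	}
	return total
}

func main() {
	values := make([]Point, 4)
	pointers := make([]*Point, 4)
	for i := range values {
		values[i] = Point{X: int64(i), Y: 1}
		pointers[i] = &Point{X: int64(i), Y: 1}
	}

	// Adjacent elements of the value slice sit exactly one Point apart.
	stride := uintptr(unsafe.Pointer(&values[1])) - uintptr(unsafe.Pointer(&values[0]))
	fmt.Println("contiguous:", stride == unsafe.Sizeof(Point{}))

	fmt.Println(sumValues(values), sumPointers(pointers))
}
```

Both functions compute the same sum; the difference is only in memory layout, which is where the cache behavior discussed in this thread comes from.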

Go is not unique in having GC and stack semantics; plenty of languages have both, all the way back to Mesa/Cedar and CLU.
I should have been more clear I guess; I was comparing it to other popular languages. Few have value types and many that do (like C#) regard them as second-class citizens.
But Go has an imprecise GC (in the reference implementation) or stack maps (in gccgo), so the GC overhead is rather huge. It also lacks compaction, so its cache-miss behavior is not that good either.
Not sure what you mean by imprecise, but Go’s GC does trade throughput for latency. The overhead still isn’t huge if only because there is so much less garbage than in other GC languages. I’m also surprised by your cache misses claim; Go has value types which are used extensively in idiomatic code so generally the cache properties seem quite good—maybe my experience is abnormal?
>Not sure what you mean by imprecise

It's a rigid term:

https://en.wikipedia.org/wiki/Tracing_garbage_collection#Pre...

perf shows how much time the GC eats, and that's quite a lot. Thus in the majority of benchmarks Go lags behind Java, or is on par with it at best.

>there is so much less garbage than in other GC languages

That is not true: strings and interfaces are heap allocated, so the only stack-allocated objects are numbers and very simple structs (i.e. ones which contain only numbers). You would therefore have a lot of garbage unless you are doing number crunching, which could easily be optimized by inlining and register allocation anyway.

> It's a rigid term

Ah, neat! I learned something. :)

You’re mistaken about only numbers and simple structs being stack allocated. All structs are stack allocated unless they escape, regardless of their contents. Further, arrays and constant-sized slices may also be stack allocated. I’m also pretty sure interfaces are only heap allocated if they escape; in other words, if you put a value in an interface and it doesn’t escape, there shouldn’t be an allocation at all.
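
One way to check this claim rather than argue about it is to count allocations. The sketch below assumes the standard gc toolchain and uses testing.AllocsPerRun from the standard library; the Record type, the sink variable, and both functions are hypothetical names made up for the illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// Record is a hypothetical struct with non-numeric fields.
type Record struct {
	Name string
	Tags [4]string
	ID   int
}

// sink is a package-level variable; storing a pointer in it forces the
// pointed-to value to escape to the heap.
var sink *Record

// localOnly builds a Record that never leaves the function.
func localOnly(id int) int {
	r := Record{Name: "local", ID: id}
	r.Tags[0] = "a"
	return r.ID + len(r.Name) + len(r.Tags[0])
}

// escapes publishes a Record through the package-level sink.
func escapes(id int) {
	sink = &Record{Name: "escaped", ID: id}
}

func main() {
	local := testing.AllocsPerRun(1000, func() { _ = localOnly(7) })
	heap := testing.AllocsPerRun(1000, func() { escapes(7) })
	fmt.Println("allocs/op, local only:", local)
	fmt.Println("allocs/op, escaping:  ", heap)
}
```

If the claim holds, the first count should be zero even though Record contains strings and an array, and the second should be nonzero. Building with go build -gcflags '-m' prints the compiler's escape decisions for the same code.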

Both arrays and interfaces are heap allocated. A slice is just a pointer to a heap-allocated array.

A struct could be stack allocated, but any of its fields that is anything other than a number would not be.

A trivial example:

https://segment.com/blog/allocation-efficiency-in-high-perfo...

    package main

    import "fmt"

    func main() {
            x := 42
            fmt.Println(x) // x is passed as an interface{} argument
    }

    ./main.go:7: x escapes to heap

So a trivial interface cast leads to allocation.
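
The compiler message is a compile-time escape decision, so it may be worth measuring what happens at run time as well. The sketch below is an assumption-laden illustration, not the code from the linked post: boxed, total, storeBoxed and storePlain are hypothetical names, and the value is derived at run time and kept large because recent Go releases avoid the allocation when converting constants and small integers to an interface.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

// boxed and total are hypothetical package-level sinks that keep the
// compiler from discarding the stores.
var (
	boxed interface{}
	total int
)

// storeBoxed converts n to an interface value, as fmt.Println does with
// its arguments.
func storeBoxed(n int) { boxed = n }

// storePlain keeps n as a plain int; no interface conversion happens.
func storePlain(n int) { total += n }

func main() {
	// Derive the value at run time so it is neither a constant nor small.
	n := 1000000 + len(os.Args)

	viaInterface := testing.AllocsPerRun(1000, func() { storeBoxed(n) })
	viaInt := testing.AllocsPerRun(1000, func() { storePlain(n) })

	fmt.Println("allocs/op via interface{}:", viaInterface)
	fmt.Println("allocs/op via int:        ", viaInt)
}
```

The exact counts depend on the compiler version, so treat the numbers as something to measure on your own toolchain rather than a fixed result.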
Regarding your last point, Crystal has the same features as Go in that regard, while at the same time being vastly more expressive. This is mostly due to the standard library in Crystal being so nice for working with collections (which perhaps isn't surprising, as the APIs are heavily influenced by Ruby). Blocks being overhead-free is another necessary part for this to work well.
Yeah, I often find myself wishing Go's type system were a bit better, but the reason I prefer it is because it's fast, easy to reason about, and the tooling/deployment stories are generally awesome (not always though--e.g., package management). So far I'm only nominally familiar with Crystal; I'll have to look into it sometime.
.NET is another example of value types in a garbage collected language. It’s also somewhat unique afaik in doing so within a VM.
Definitely. I’m sad that they’re not more idiomatic in C#. I definitely prefer values and references over OOP class objects.