by pron 3949 days ago
> I can sidestep gc and carefully control memory layouts for cache friendliness

Memory layout and GC are two completely orthogonal issues. You will be able to control memory layout quite well with Valhalla (value types) and even on a finer-grained level with Panama if you need C interoperability. VarHandles (hopefully in Java 9) will give you safe access to off-heap memory. Currently you can do that with Unsafe, which is more work but still less than C++.
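To make the off-heap point concrete, here is a minimal sketch. It deliberately uses `java.nio.ByteBuffer.allocateDirect` instead of Unsafe or VarHandles, because that is the safe public API available today. The `OffHeapPoints` class, its field offsets and the 16-byte record size are all made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical fixed-size record stored off-heap: { long id; int x; int y; }
// The field order and offsets are chosen by hand, not by the JVM.
final class OffHeapPoints {
    static final int ID_OFFSET = 0;    // 8 bytes
    static final int X_OFFSET = 8;     // 4 bytes
    static final int Y_OFFSET = 12;    // 4 bytes
    static final int RECORD_SIZE = 16; // total bytes per record

    private final ByteBuffer buf;

    OffHeapPoints(int capacity) {
        // A direct buffer's storage lives outside the garbage-collected heap.
        buf = ByteBuffer.allocateDirect(capacity * RECORD_SIZE)
                        .order(ByteOrder.nativeOrder());
    }

    void set(int index, long id, int x, int y) {
        int base = index * RECORD_SIZE;
        buf.putLong(base + ID_OFFSET, id);
        buf.putInt(base + X_OFFSET, x);
        buf.putInt(base + Y_OFFSET, y);
    }

    long id(int index) { return buf.getLong(index * RECORD_SIZE + ID_OFFSET); }
    int x(int index)   { return buf.getInt(index * RECORD_SIZE + X_OFFSET); }
    int y(int index)   { return buf.getInt(index * RECORD_SIZE + Y_OFFSET); }

    // Sequential scan over one field of the first `count` records.
    long sumX(int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += x(i);
        }
        return sum;
    }
}
```

Records are contiguous and every field sits at an offset you picked, at the cost of doing the offset arithmetic yourself.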

> What the jvm really buys you is not having to build once for each platform

Oh, I'd say it buys you a lot more: seamless polyglotism, exceptional performance even for dynamic stuff (dynamic languages, esp. w/ Graal, but even cool bytecode manipulation in Java or even simple code loading/swapping), and you get all that performance with unprecedented observability into the running platform.

2 comments

Value types will provide the ability to allocate storage embedded in a heap object or on the stack, but they don't provide layout control (i.e. control over the order of fields in the layout). It's a good change, but let's not exaggerate.
As the requirement was "layout control for cache friendliness", value types are all you need (or 99.99% of what you can possibly need). For interop, there's Panama. Let's not nitpick.
99.99% is perhaps your estimate, but not necessarily others'. This is also not likely what people would consider "layout control" if they're coming from a language that allows field-level layout control. Being able to manually place fields that are frequently used together is quite useful in quite a few circumstances.
> Being able to manually place fields that are frequently used together is quite useful in quite a few circumstances.

I have almost never seen this make a difference outside of, say, GPU programming. The fact that Java's optimizer is much better than that of Go will make a much larger difference in execution speed.

This is mostly an issue for large objects (i.e. ones that span multiple cache lines) whose fields have different access patterns (i.e. clusters of fields accessed together).
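A rough sketch of what grouping a cluster of fields looks like when done by hand in today's Java: the fields read together in the inner loop are interleaved in one primitive array, and the rarely touched ones are kept elsewhere. The `Orders` class and its fields are hypothetical, and this relies on array elements being stored contiguously, which holds in practice on HotSpot.

```java
// Hypothetical example: price and quantity are scanned constantly,
// while the note and timestamp are rarely touched.
final class Orders {
    private static final int HOT_STRIDE = 2; // [price, quantity] per order

    // Hot cluster: price and quantity of order i sit next to each other.
    private final long[] hot;
    // Cold cluster: kept in separate arrays, away from the hot data.
    private final String[] note;
    private final long[] createdAtMillis;
    private int size;

    Orders(int capacity) {
        hot = new long[capacity * HOT_STRIDE];
        note = new String[capacity];
        createdAtMillis = new long[capacity];
    }

    void add(long priceCents, long quantity, String customerNote, long createdAt) {
        int i = size++;
        hot[i * HOT_STRIDE] = priceCents;
        hot[i * HOT_STRIDE + 1] = quantity;
        note[i] = customerNote;
        createdAtMillis[i] = createdAt;
    }

    // The inner loop only walks the hot array, front to back.
    long totalValueCents() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += hot[i * HOT_STRIDE] * hot[i * HOT_STRIDE + 1];
        }
        return total;
    }

    String note(int i) { return note[i]; }
}
```

This is the workaround, not the feature: you give up ordinary objects to get the placement, which is what field-level layout control would avoid.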

The other aspect of layout control is cacheline padding, which is also not present in the JVM. There's @Contended, but it's a blunt tool and not currently a public API (it's in sun.misc).
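For reference, the usual manual substitute for padding is the class-hierarchy trick sketched below; it does not use @Contended. All class names here are hypothetical. It assumes the JVM lays out superclass fields before subclass fields, which HotSpot does in practice but the specification does not guarantee, and it assumes 64-byte cache lines.

```java
// Padding placed before the hot field: 7 longs = 56 bytes.
class PadBefore { long p0, p1, p2, p3, p4, p5, p6; }

// The one field we want isolated from other counters' hot fields.
class HotValue extends PadBefore { volatile long value; }

// Padding placed after the hot field.
final class PaddedCounter extends HotValue {
    long q0, q1, q2, q3, q4, q5, q6;

    // Single-writer increment: not atomic, so only one thread may
    // call this on a given counter.
    void increment() { value = value + 1; }

    long get() { return value; }
}

final class PaddedCounterDemo {
    // Two threads, each incrementing its own counter.
    static long[] run(int iterations) {
        PaddedCounter a = new PaddedCounter();
        PaddedCounter b = new PaddedCounter();
        Thread ta = new Thread(() -> {
            for (int i = 0; i < iterations; i++) a.increment();
        });
        Thread tb = new Thread(() -> {
            for (int i = 0; i < iterations; i++) b.increment();
        });
        ta.start();
        tb.start();
        try {
            ta.join();
            tb.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.get(), b.get() };
    }
}
```

The demo only shows that the counters behave correctly. Whether the padding actually keeps the two hot fields on separate cache lines depends on the JVM's layout decisions, which is exactly the lack of guaranteed control being discussed.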

>The fact that Java's optimizer is much better than that of Go will make a much larger difference in execution speed

Yes, but that's orthogonal.

Both grouping and padding are possible with Java's value types. The only things you don't have full, guaranteed (i.e. portable across JVMs) control over are ordering and alignment.
But value types do let you group fields (with sub-component values).

Also, I've noticed that whenever I say Java does X, you say, "Oh, no! It does X - ε!" Now, to me, that's nitpicking, especially considering that a perfect general-purpose language/runtime designed to be simple (for some definition of simple) should give you 90+% performance in 99% of general-purpose use cases (or 95% in 95% etc.). If it does any better then one of two possibilities is true: 1/ it's magic, or 2/ it's not a perfect simple language/runtime because it could have been made simpler (by whatever definition of simple it's chosen).

Anyone who can't settle for anything less than 100% performance or does something that's outside 95% of the use cases knows not to use such a general-purpose language/runtime, and, instead, uses a more domain-specific language/runtime or one that's not designed to be simple.

>But value types do let you group fields (with sub-component values).

They let you treat the fields of a value type as a "blob"; you have no control over how they're laid out within that blob.

>Also, I've noticed that whenever I say Java does X, you say, "Oh, no! It does X - ε!" Now, to me, that's nitpicking, especially considering that a perfect general-purpose language/runtime designed to be simple (for some definition of simple) should give you 90+% performance in 99% of general-purpose use cases (or 95% in 95% etc.). If it does any better then one of two possibilities is true: 1/ it's magic, or 2/ it's not a perfect simple language/runtime because it could have been made simpler (by whatever definition of simple it's chosen).

Nothing personal, but I find your JVM-related posts borderline fanboyism (and I say this as someone who greatly respects the engineering in HotSpot, despite certain things bugging me). It's not about 100% or 95% performance; it's about not making exaggerated claims since, as you say, there's no magic.

> I find your JVM-related posts borderline fanboyism

I think they're much less fanboyish than any other language/runtime discussion on HN.

> it's about not making exaggerated claims

I am not making exaggerated claims because perfect for me (and, by definition 95% of people) is precisely how I've defined the requirements from a runtime like Java (or Go, or Python, for that matter). So, if I say "it gives you all the layout control you need", I don't mean all the layout control you need to write a DSP, or all the layout control you need to get 100% performance -- only 99% performance. I think most people make the same assumption (because they're not writing DSPs), and I find your posts nitpicky and possibly misleading for the target audience (who, by and large, don't write DSPs or medical devices or particle accelerator beam controllers -- at least certainly on threads not discussing those interesting but extreme use cases). I don't think it's reasonable to qualify every statement for some negligible (possibly nonexistent) portion of the participants, especially considering that that minority already knows the statements don't completely apply to them. Not doing so doesn't qualify as exaggeration IMO, but reasonable expressiveness, or else every discussion will be bogged down in irrelevant detail that will only distract from the main point.

So, yes, I stand behind my claim that value types give you all the control you need to get 99% performance for 99% of the use cases people would normally use Java for.

X - ε, for very large values of ε.
I think I disagree about the observability -- VTune is a lot easier to use when just tuning straight C++ rather than Java.
Are you familiar with JMH's perfasm? http://psy-lob-saw.blogspot.com/2015/07/jmh-perfasm.html

And for profiling apps on production, I've yet to encounter a more thorough, low-overhead profiler than Java Flight Recorder.

No, but I'll check it out this afternoon. Thanks!
VTune is not too dissimilar to JProfiler.

And, as pron mentioned, plenty of tools exist for lower-level access.

VTune gives access to PMU counters and attributes them to assembly. JProfiler is purely a Java-level profiler (it won't even tell you about hot spots in the JVM itself, never mind assembly). They're not really comparable.
JProfiler, at least for my use cases, isn't really similar to VTune at all. I know what my hot spots are: they're the inner bits of algorithms that run a few billion to a few trillion times. What I need to do is understand, as granularly as possible, the exact instructions and how the various caches and memory are operating. Convex and tree optimizers are generally limited by memory speed, and my goal is to have this code run at, e.g., 0.9+ of memory bandwidth.