This is correct; the standard approach here is to use regular C-style memory management for the data the system is managing, and the JVM heap only for the database "infrastructure".
This hybrid approach gives you the benefits of a managed runtime and the safety of GC for most of your code, while allowing the performance of raw pointers/malloc on key code paths.
Hah :) I don't mean that malloc itself is fast; I mean that keeping memory off the JVM heap is fast.
You can't hold a raw pointer to a permanent memory block on the JVM heap, since the GC moves objects around. And even though those blocks will never be collected, they are additional work for the GC to track.
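To make the off-heap idea concrete, here is a minimal sketch using the standard `ByteBuffer.allocateDirect` call. The class name `OffHeapLongStore` and its methods are made up for illustration and are not taken from any of the systems mentioned in this thread; real databases typically use more elaborate allocators.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-width store whose values live outside the JVM heap. */
final class OffHeapLongStore {
    private final ByteBuffer buf;
    private final int capacity;

    OffHeapLongStore(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("capacity must be >= 0");
        }
        this.capacity = capacity;
        // allocateDirect reserves native memory (zero-initialized). Only the
        // small ByteBuffer wrapper is a heap object; the contents are not
        // scanned or moved by the GC.
        this.buf = ByteBuffer
                .allocateDirect(Math.multiplyExact(capacity, Long.BYTES))
                .order(ByteOrder.nativeOrder());
    }

    void put(int index, long value) {
        checkIndex(index);
        // Absolute put: writes at a byte offset without touching the buffer position.
        buf.putLong(index * Long.BYTES, value);
    }

    long get(int index) {
        checkIndex(index);
        return buf.getLong(index * Long.BYTES);
    }

    boolean isDirect() {
        return buf.isDirect();
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        OffHeapLongStore store = new OffHeapLongStore(1024);
        store.put(0, 42L);
        System.out.println(store.get(0)); // prints 42
    }
}
```

The data is addressed by offset rather than by object reference, so nothing here depends on where the GC decides to put things. The native memory is only released once the wrapper object itself is collected, which is part of the trade-off.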
HBase is going off-heap as much as possible. VoltDB uses Java for management and C++ for the low-level parts.
They will end up writing C++ in Java eventually, depending on how much performance you REALLY need.
The same goes for Elasticsearch: if you want performance, you need to do the same thing ScyllaDB did to Cassandra (per-core sharding, skip filesystem across cores, etc.).
In Elasticsearch terms, vespa.ai, which claims better performance/scalability/maintainability, uses C++ for the Lucene layer and Java for the Solr/Elasticsearch layer.
There are blog posts about speeding up Lucene by 2x+ by changing some parts to C/C++. There are libraries (Trinity) claiming 2x+ performance.
I've also read a Google engineer saying "Bigtable is 3x faster than HBase".
Everything is relative, I guess.