by moru0011 4053 days ago
Pointless. It's the number of objects, not the size of primitive fields (e.g. the char array), which hurts GC and consumes memory. This proposal will save <10% on an average short string instance but will probably cost performance.

I assume this is motivated by heap analysis done by Oracle on their cloud/SaaS/... applications. They may have quite a few large strings in old gen. To give you an example, for every application deployed, Tomcat builds and retains a 200 KB String. Other candidates are SQL queries, manifests or in-heap caches.

But I agree with you on the performance side. My impression is that a lot of Java applications are simple data pumps. They read data from a database and send it to a client. It's hard to see how this JEP helps in that case:

- read bytes from the network (probably UTF-8)

- convert bytes to Java Strings

- new: compress Java String (ASCII, Latin-1, UTF-8)

- "render" String to Writer (HTML, XML, JSON, ...)

- new: decompress Java String for Writer

- encode again to UTF-8 for OutputStream/browser

In this case this would increase allocation rates and increase CPU load for a potentially smaller old gen.
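To make the steps above concrete, here is a minimal sketch of such a data pump using only standard JDK classes. The class and method names (`DataPump`, `pump`) are made up for illustration, and the comments marking where compression and decompression would happen reflect the assumption in this comment, not measured JVM behavior. JSON escaping is left out to keep it short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class DataPump {
    // Hypothetical helper: takes UTF-8 bytes "from the network" and
    // returns the UTF-8 bytes that would be sent to the client.
    static byte[] pump(byte[] utf8FromNetwork) {
        // Decode bytes into a String. With compact strings, the assumed
        // extra "compress" step would happen somewhere in here.
        String value = new String(utf8FromNetwork, StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            // "Render" to a Writer. The Writer API works on chars, so the
            // assumed "decompress" step would happen before re-encoding.
            writer.write("{\"name\":\"");
            writer.write(value); // no escaping: sketch only
            writer.write("\"}");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the writer flushed the re-encoded UTF-8 bytes into 'out'.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] in = "Zo\u00eb".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(pump(in), StandardCharsets.UTF_8));
    }
}
```

The String here lives only for the duration of one request, so any space saved by a smaller representation is thrown away almost immediately, which is the point being made above.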

The only real way to optimize this would be to redesign the String class to be encoding aware and update the Writer classes accordingly. This is unlikely to happen and would hurt other use cases.

Can this particular case not be solved by adding a constructor that doesn't compress the string?

Edit: "There are no plans to add any new public APIs or other interfaces." :(

> Can this particular case not be solved by adding a constructor that doesn't compress the string?

Presumably yes, if you use one of the byte[] constructors and the encoding is already the compression format or something compatible with it.
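For reference, these are the existing byte[] constructors in question. This is only a sketch: the class name `DecodeDemo` is made up, and whether the JVM could skip work when the charset matches the internal format (say Latin-1) is an assumption about a future implementation, not something the API promises.

```java
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    // Decodes bytes that are already Latin-1. If the internal compressed
    // format were Latin-1 (an assumption), this could in principle be
    // little more than an array copy.
    static String fromLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decodes UTF-8 bytes. Multi-byte sequences have to be decoded
    // regardless of the internal representation.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};              // e-acute in Latin-1
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};   // e-acute in UTF-8
        System.out.println(fromLatin1(latin1).equals(fromUtf8(utf8)));
    }
}
```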

Whether you'll be able to do that depends very much on how you implemented your IO. We're still seeing way too many String#substring calls in our traces after it became slow in 1.7.0_06. Some of them can be fixed easily, others not so much.

Agreed. The copy-on-substring behavior is a real pain, and I don't know if there's any workaround.
Not using String, e.g. using CharBuffer (and #slice) or building your own. It's annoying and not always an option.
Yep, as you said, not always an option. No way to pass them to external things expecting strings, for one. Wouldn't be an issue, except that you can't extend strings.
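The CharBuffer workaround mentioned above can be sketched like this. The names `SliceDemo` and `view` are made up for illustration. It shows both the benefit (the slice shares the backing array, no chars are copied) and the limitation discussed in the replies (anything that wants a real String forces a copy via toString()).

```java
import java.nio.CharBuffer;

final class SliceDemo {
    // Returns a view over backing[start, end) that shares the array
    // instead of copying it, similar to pre-1.7.0_06 substring.
    static CharBuffer view(char[] backing, int start, int end) {
        // wrap() sets position=start and limit=end; slice() creates a
        // buffer whose index 0 is 'start' and which shares the same array.
        return CharBuffer.wrap(backing, start, end - start).slice();
    }

    public static void main(String[] args) {
        char[] data = "hello world".toCharArray();
        CharBuffer word = view(data, 6, 11);

        // CharBuffer implements CharSequence, so many APIs accept it as-is.
        CharSequence seq = word;
        System.out.println(seq.length());

        // The view is backed by 'data': changes to the array show through.
        data[6] = 'W';
        System.out.println(word);

        // Anything that requires a java.lang.String needs a copy.
        String copy = word.toString();
        System.out.println(copy.isEmpty());
    }
}
```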