by GlitchMr 1460 days ago
I don't think using String::intern makes sense, especially now that Java's garbage collector is capable of deduplicating strings (https://openjdk.org/jeps/192). In the past it could have been used to reduce memory usage when a given string was used a lot, but now there are better ways of dealing with that issue.

G1's deduplication is nice, but it is a lot weaker than what String.intern does. G1 deduplicates the underlying byte array but leaves the String objects themselves separate (so s1 == s2 still evaluates to false). So you still pay for an extra object header per copy.
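
To make the identity point concrete, here is a minimal sketch. The class and helper names (`DedupDemo`, `fresh`) are made up for illustration. GC deduplication only shares the backing array, so it never changes the result of `==`; only interning (or some other explicit canonicalization) does.

```java
public class DedupDemo {
    // Builds a fresh String object at runtime, so it is not the pooled literal.
    static String fresh(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String s1 = fresh("USA");
        String s2 = fresh("USA");

        System.out.println(s1.equals(s2));              // true: same contents
        System.out.println(s1 == s2);                   // false: two distinct objects
        System.out.println(s1.intern() == s2.intern()); // true: one canonical object
    }
}
```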

If you have (as one of our applications did) millions of copies of the string "USA" in memory, that's many megabytes that explicit deduplication can save and the garbage collector can't.

String.intern isn't the way, for all the reasons this post outlines, but just using G1 isn't the right approach either.
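
One way to do explicit deduplication without String.intern is an application-level pool. The sketch below is only an illustration, not what our application actually used. `StringCanonicalizer` and its methods are hypothetical names. It assumes the set of distinct values is small and bounded (country codes, say), because the map holds strong references and nothing is ever evicted.

```java
import java.util.concurrent.ConcurrentHashMap;

public class StringCanonicalizer {
    // Maps each distinct value to the single instance we hand out for it.
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the canonical instance equal to s, registering s if it is the first one seen.
    public String canonicalize(String s) {
        if (s == null) {
            return null;
        }
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Number of distinct strings currently held by the pool.
    public int size() {
        return pool.size();
    }
}
```

You would call `canonicalize` at the point where the strings enter the system (parsing a row, decoding a message), so the duplicate String objects become garbage right away instead of living in the heap.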

IIRC all of the concurrent GCs can dedupe now. Not just G1.

Hopefully soon object headers will be negligible with progress from Lilliput though.

Strings have an object header, an int for the hashcode and a pointer to the array. Assuming a < 32 GB heap (so 4-byte pointers), that's 24 bytes for the string, even once the array is deduped. Lilliput is awesome, but an 8-byte header would only reduce that to 16 bytes.
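
The arithmetic behind those numbers, as a rough sketch. It assumes a 12-byte header today (64-bit HotSpot with compressed class pointers) and the default 8-byte object alignment, and it counts only the fields listed above. Newer JDKs add a couple of one-byte fields to String that this ignores, so treat the result as an estimate. `StringSizeEstimate` and its methods are hypothetical names.

```java
public class StringSizeEstimate {
    // Rounds size up to the next multiple of the object alignment.
    static long align(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Shallow size of a String, counting only the header, the int hash and the array reference.
    static long shallowStringSize(long headerBytes) {
        long hashField = 4; // int hash
        long arrayRef = 4;  // compressed reference to the backing array
        return align(headerBytes + hashField + arrayRef, 8);
    }

    public static void main(String[] args) {
        long current = shallowStringSize(12); // assumed 12-byte header
        long lilliput = shallowStringSize(8); // assumed 8-byte header
        System.out.println(current);  // 24 (12 + 4 + 4 = 20, padded to 24)
        System.out.println(lilliput); // 16 (8 + 4 + 4 = 16, no padding)

        // Overhead for a million separate String objects, array excluded.
        long copies = 1_000_000L;
        System.out.println(current * copies + " bytes per million strings");
    }
}
```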