|
|
|
|
|
by cynicalkane
3635 days ago
|
|
Sometimes you just need to allocate, whether due to necessity or expediency. If you make sure that "almost all" allocations are short-lived, GC is very fast. Allocation is bumping a pointer and cleanup is O(number of new, live objects). It's considerably faster than malloc/free for general-case allocation. |
|
Also don't forget that when GC runs it trashes your d/i-caches; temporaries/garbage allocs reduce your d-cache efficacy; GC must suspend and resume the java threads, which is trips to the kernel scheduler; there are some pathologies with Java threads reaching/detecting safepoints.
GC store barriers (aka card marking) don't have anything to do with thread contention (apart from one thing, which I'll note later). This is a commonly used technique to record old->young gen references, and serves as a way to reduce the set of roots when doing young GC only (i.e. you don't need to scan the entire heap). So this isn't about thread contention, per say -- with the exception that you can get false sharing due to an implementation detail, such as in Oracle's Hotspot.
The card table is an array of bytes. Each heap object's address can be mapped into a byte in this array. Whenever a reference is assigned, Hotspot needs to mark the byte covering the referrer as dirty. The false sharing comes about when different threads end up executing stores where the objects requiring a mark end up mapping to bytes that are on the same cacheline - fairly nasty if you hit this problem as it's completely opaque. So Hotspot has a XX:+UseCondCardMark flag that tries to neutralize this by first checking if the card is already dirty, and if so, skips the mark; as you can imagine, this inserts an extra compare and branch into the existing card marking code - no free lunch.