| > You realize this person that you are calling an expert claimed that they optimize software by putting tiny memory allocations into their tight loops, right?

You're either misunderstanding, or pretending to misunderstand for some reason. What I said was that allocating fresh objects and using those can be faster than re-using stale objects in some failed attempt to optimise by reducing allocations.

Why would that be? For the reasons I explained: The newly allocated objects are guaranteed to already be in cache. Each new object is guaranteed to be close to the last object you used, because they're allocated next to each other. The new objects are not going to need any memory barriers, because they're guaranteed to not be published. The new objects are less likely to escape, so they're eligible for scalar replacement. You dismissed all that as 'throwing out terminology'.

Here's a practical example:

require 'benchmark/ips'

# Allocates a new three-element array on every call.
def clamp_fresh(min, max, value)
  fresh_array = Array.new
  fresh_array[0] = min
  fresh_array[1] = max
  fresh_array[2] = value
  fresh_array.sort!
  fresh_array[1] # the middle element is the clamped value
end

# Re-uses a caller-supplied array, so it allocates nothing.
def clamp_cached(cached_array, min, max, value)
  cached_array[0] = min
  cached_array[1] = max
  cached_array[2] = value
  cached_array.sort!
  cached_array[1]
end

cached_array = Array.new

Benchmark.ips do |x|
  x.report("use-fresh-objects") { clamp_fresh(10, 90, rand(0..100)) }
  x.report("use-cached-objects") { clamp_cached(cached_array, 10, 90, rand(0..100)) }
  x.compare!
end

Which would you think is faster? The one that allocates a new object each iteration of the inner loop? Or the one that re-uses an existing object each time and doesn't allocate anything?

It's actually the one that allocates a new object each time. The cached one is 1.6x slower in an optimising implementation of Ruby. The only change I made was to add an object allocation in place of the custom object caching. I went from not allocating any objects to allocating an object, and it became faster. This example is so clear because of the last factor I mentioned - scalar replacement.

If you came along and 'optimised' my code based on a cargo cult idea of 'object allocation is disastrously slow', you wouldn't be helping, would you? |
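
To make the scalar replacement point concrete, here is a rough hand-written sketch of what an optimising compiler can in effect reduce clamp_fresh to once it has proved the array never escapes. This is an illustration under that assumption, not actual compiler output, and the name clamp_scalar_replaced is hypothetical.

```ruby
# Hypothetical hand-written equivalent of clamp_fresh after scalar replacement.
# The three array slots become plain locals, so no array is allocated at all.
def clamp_scalar_replaced(min, max, value)
  a = min
  b = max
  c = value
  # Sort the three values with compare-and-swap steps (a tiny sorting network).
  a, b = b, a if a > b
  b, c = c, b if b > c
  a, b = b, a if a > b
  b # the middle value, matching fresh_array[1] after sort!
end
```

The cached version cannot be reduced this way, because the array it writes to lives outside the method and has to be kept up to date in memory on every call.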