| Necessary depends on your use case! I spend a lot of time waiting on analyses, and the more operations I can test, the more bugs I can find. I probably invest more time in performance optimization than most people. Regarding your specific example, uh, I don't know how you measured, but that feels... off. Here: ```
package scratch; public class Sum {
public static long sum(long[] longs) {
long sum = 0;
for (int i = 0; i < longs.length; i++) {
sum += longs[i];
}
return sum;
}
}
``` ```
(require '[criterium.core :refer [quick-bench bench]])
(def longs (long-array (range 100000000)))
(defn sum-clojure [size] (reduce + 0 (range 0 size)))
(import 'scratch.Sum)
``` Summing an array of 10^8 longs like you suggested takes ~120 ms on my machine. ```
user=> (quick-bench (Sum/sum longs))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 119.804166 ms
Execution time std-deviation : 9.124709 ms
Execution time lower quantile : 115.154905 ms ( 2.5%)
Execution time upper quantile : 135.597497 ms (97.5%)
Overhead used : 20.384654 ns Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 15.4197 % Variance is moderately inflated by outliers
``` Your Clojure function, which sums a lazy range, takes ~4.7 seconds--about 40x slower. ```
user=> (quick-bench (sum-clojure 100000000))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 4.739575 sec
Execution time std-deviation : 346.895187 ms
Execution time lower quantile : 4.557732 sec ( 2.5%)
Execution time upper quantile : 5.324739 sec (97.5%)
Overhead used : 20.384654 ns Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 15.3163 % Variance is moderately inflated by outliers
``` An idiomatic Clojure reduction over the same array of longs, just so we're measuring apples to apples, takes about 12 seconds--about 100x slower. ```
user=> (reduce + longs)
4999999950000000
user=> (quick-bench (reduce + longs))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 12.037821 sec
Execution time std-deviation : 433.661044 ms
Execution time lower quantile : 11.790637 sec ( 2.5%)
Execution time upper quantile : 12.779262 sec (97.5%)
Overhead used : 20.384654 ns Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
``` Incidentally, this is one of the reasons I wrote `loopr` (https://aphyr.com/posts/360-loopr-a-loop-reduction-macro-for...). One of the iteration tactics it can compile to is iteration over arrays. Nowhere near Java--I think it's probably still boxing a fair bit, and I'd need to disassemble it to see--but it's still 10x faster than the standard reduce here. ~10x slower than Java. ```
user=> (quick-bench (loopr [sum 0] [x ^"[J" longs :via :array] (recur (+ sum x))))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 1.259095 sec
Execution time std-deviation : 3.876650 ms
Execution time lower quantile : 1.256692 sec ( 2.5%)
Execution time upper quantile : 1.265747 sec (97.5%)
Overhead used : 20.321072 ns Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
``` |
I read a little more about this before you replied, but it seems like we're doing something different or there's a JVM difference that is bigger than expected.
- The clojure function takes me less than a second.
- (reduce + (long-array 100000000)) also takes less than a second.
- From what I understand, the compiler may have a special rule around using reduce and + that uses unboxed numbers by default
- Isn't it a big "if" that you might have a big array of primitive longs in Clojure? I can see how if you've already gone through a whole range of numbers and unboxed them then the java function would be fast. But it seems like it'd be easier to use + (again, this is running a lot faster on my machine for whatever reason).
I'm new to all of this so just being academic and trying to figure out what's going on