| HN Mirror


by LeCompteSftware 60 days ago

Naively this is quite surprising, but the devil is in the details. With the exception of Lean I'd point out they're all fairly close: Chez being 2.5x slower than C++ is not ignorable but it's also quite good for a dynamically-typed JITted language[1]. And I'm not surprised that F# does so well at this particular task. Without looking into it more closely, this seems to be a story about F# on .NET Core having the most mature and painless out-of-the-box parallelism of these languages. I assume this is elapsed time, it would be interesting to see a breakdown of CPU time.

I don't think these results are quite comparable because of slightly differing parallelism strategies; I'd expect the F# implementation of just spinning off threads to be a little more performant than a Rayon parallel iterator, which presumably has some overhead. But that really just shows how hard it is to do a cross-language comparison; Rust and C++ can certainly be made faster than the F# code by carefully manipulating a ton of low-level OS concurrency primitives. This would arguably also be a little misleading. Likewise Chez and Haskell have good C FFI; does that count? It's a tricky and highly qualitative analysis.

[1] FYI, one possible performance improvement with the Chez code is keeping the permutations in fxvectors and replacing math operations with the fixnum-specific equivalents - this tells the compiler/interpreter that the data are guaranteed to be machine integers rather than bigints, so they aren't boxed/unboxed. I am not sure without running it myself, but there seem to be avoidable allocations in the Chez implementation. https://cisco.github.io/ChezScheme/csug/objects.html#./objec...
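
To make the footnote concrete, here is a minimal sketch of the fxvector idea in Chez Scheme. The helper names `identity-permutation` and `fxvector-swap!` are made up for illustration and are not taken from the benchmarked code; only `make-fxvector`, `fxvector-ref`, `fxvector-set!`, `fx+` and `fx>=` are Chez built-ins.

```scheme
;; Sketch only: both helper names below are hypothetical.

;; Build the identity permutation 0..n-1 as an fxvector,
;; using fixnum-specific arithmetic for the loop counter.
(define (identity-permutation n)
  (let ([v (make-fxvector n)])
    (do ([i 0 (fx+ i 1)])
        ((fx>= i n) v)
      (fxvector-set! v i i))))

;; Swap two slots of an fxvector in place.
(define (fxvector-swap! v i j)
  (let ([tmp (fxvector-ref v i)])
    (fxvector-set! v i (fxvector-ref v j))
    (fxvector-set! v j tmp)))
```

Whether this actually removes allocations in the benchmark would need measuring; the sketch only shows the shape of the change.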

2 comments

Syzygies 60 days ago

Thank you. I will try your Chez idea. I love Chez, even if coding in Scheme can feel like rubbing sticks together to start a fire on an island, when e.g. Scala has induction ranges. And I didn't try Idris or Racket as they compile to Chez, but perhaps they do so better than I did.

As for parallelism this is a primary concern of mine, and I tried multiple approaches for every language where there was a choice. I used my own work-stealing code only when it beat standard libraries. AI warned me I was in over my head, that writing such a library takes years of experience, but my use case (and my expected use cases in my research) is so uniform that simple can win, minimally touching the required bases such as permuting tasks to avoid false sharing.

I don't believe that the JIT languages (F# on top) do so well because of better parallelism. This is branch optimization. For this use case an AOT compiler with ample benchmark data to influence output should do better. That isn't a thing, and the argument seems to be that few use cases stay consistent. A JIT can adapt.


Syzygies 59 days ago

Yes, Chez improved a bit, at the expense of readability.


LeCompteSftware 59 days ago

Yeah :/ For a larger program you can pay the readability toll once, via a syntactic form that expands the general vector/arithmetic operations to the fixnum versions, e.g. use something like

  (define (heap-permute! perm j callback)
    (with-context 'fixnum ;; same trick works with 'flonum for 64-bit floats
      (let ([n (length perm)]) ;; actually fxvector-length
        (let generate ([k (- n 1)]) ;; actually fx-
          (if (< k j) ;; fx<
              (callback perm)
              (begin
                (generate (- k 1)) ;; fx-
                (do ([i j (+ i 1)]) ;; fx+
                    ((>= i k)) ;; fx>=
                  (if (even? (- j k)) ;; fxeven?, fx-
                      (swap perm j k)
                      (swap perm i k))
                  (generate (- k 1))))))))) ;; fx-

Sorry if I borked the indentation. I have been working on stuff like this, and more general macros around dependency injection and inversion of control (e.g. you could write this macro to take the type as a parameter and generate code optimized for 'bigint or 'rational). Maybe check back after the summer :)
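
For reference, here is roughly what the snippet above looks like written out by hand, following its inline comments. This is only a sketch: `with-context` is a hypothetical form, so nothing here is its real output, and `heap-permute-fx!` and `fxvector-swap!` are illustrative names standing in for `heap-permute!` and `swap`. It assumes `perm` is an fxvector.

```scheme
;; Assumed helper standing in for `swap`: exchange two fxvector slots in place.
(define (fxvector-swap! v i j)
  (let ([tmp (fxvector-ref v i)])
    (fxvector-set! v i (fxvector-ref v j))
    (fxvector-set! v j tmp)))

;; Hand-expanded fixnum version: permutes slots j..n-1 of perm in place
;; (Heap's algorithm) and calls callback once per permutation.
(define (heap-permute-fx! perm j callback)
  (let ([n (fxvector-length perm)])
    (let generate ([k (fx- n 1)])
      (if (fx< k j)
          (callback perm)
          (begin
            (generate (fx- k 1))
            (do ([i j (fx+ i 1)])
                ((fx>= i k))
              (if (fxeven? (fx- j k))
                  (fxvector-swap! perm j k)
                  (fxvector-swap! perm i k))
              (generate (fx- k 1))))))))
```

The callback sees the same fxvector each time, mutated in place, so it should copy `perm` if it needs to keep a permutation around.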

And BTW I misspoke earlier, of course Chez is AOT rather than JIT. From one approach it's sort of a hybrid: really fast on-the-fly AOT kinda looks like JIT, tongue-in-cheek you could say "NoT compilation" (nick-of-time). But proper JIT of course has huge advantages. If you reeaaaallly wanted to sabotage readability, Chez makes it easy to invoke the compiler at runtime, so along with the C FFI I think you could hack together some sort of JIT. But wow, what a mess that would be! You'd better be getting a PhD thesis out of it :) And if the performance is that critical you'd be much better off with F#.
