I don't think this Haskell implementation is heavily optimised per se. The only places where readability is sacrificed for performance is the use of INLINE and the custom tail-recursive function, which is not where most of the performance improvement comes from. The other improvements (an efficient RNG, an efficient vector representation, and extract strictness) are something you get by default in Rust.
I would say that if an experienced Haskell programmer set out to write this program with the intent of making it fast, they'd never use lists in the first place, and not consider their absence an optimisation. Haskell lists are designed for lazily-generated arbitrary-length sequences, and are a bad fit for representing small finite-length sequences, such as vectors.
I would say that if an experienced Haskell programmer set out to write this program with the intent of making it fast, they'd never use lists in the first place, and not consider their absence an optimisation. Haskell lists are designed for lazily-generated arbitrary-length sequences, and are a bad fit for representing small finite-length sequences, such as vectors.