Spark is written in Scala, and Scala is its first-class language. Other languages suffer from either second-class APIs (Java) or codec/serde overhead (PySpark), and PySpark is also missing a few APIs that Scala has.
FWIW, I do believe there is a serious case to be made for Haskell, but it’s probably out of scope here and would require changing many other decisions.
If integrating with Java tools were important, then personally I’d ask, “why not Clojure?”