by otter-in-a-suit 634 days ago
Author here. This decision went through all the proper architecture channels, including talks with our engineers, proofs of concept and the like.

I’ve been doing this too long to shoehorn in my pet languages if I didn’t think they were a good fit. And I think that Scala/FP + Flink _is_ a good fit for this use case.

We did also explore the Go ecosystem fwiw - the options there are limited (especially around data tooling like Iceberg) and Go is simply not a language that’s popular enough in the data world.

Python’s typing system (or lack thereof) is a huge hindrance in this space in general (imo), and Java didn’t cause many happy faces on the Eng team either, but it’s certainly an option. I just find FP semantics a better fit for data / streaming work (lots of map and flatMap anyways), and Scala makes that easy.

Also no cats/zio - just some tagless final _inspired_ composition and type classes. Not too difficult to reason about, not using any obscure patterns. I even mutate references sometimes. :-)
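
For readers unfamiliar with the term, here is a minimal sketch of what tagless-final-inspired composition with hand-rolled type classes (no cats/zio) can look like. It is not the code from the project discussed here; every name in it (`Monad`, `EventSource`, `Pipeline.doubled`) is hypothetical and chosen only for illustration.

```scala
object TaglessSketch {
  // Hand-rolled type class instead of cats.Monad (hypothetical, for illustration).
  trait Monad[F[_]] {
    def pure[A](a: A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

  object Monad {
    // Option instance: a missing value short-circuits the rest of the pipeline.
    implicit val optionMonad: Monad[Option] = new Monad[Option] {
      def pure[A](a: A): Option[A] = Some(a)
      def flatMap[A, B](fa: Option[A])(f: A => Option[B]): Option[B] = fa.flatMap(f)
    }
  }

  // The "algebra": what the pipeline needs, abstracted over the effect type F.
  trait EventSource[F[_]] {
    def parse(raw: String): F[Int]
  }

  object EventSource {
    // One interpreter: parsing that fails yields None instead of throwing.
    implicit val optionSource: EventSource[Option] = new EventSource[Option] {
      def parse(raw: String): Option[Int] = scala.util.Try(raw.trim.toInt).toOption
    }
  }

  object Pipeline {
    // Business logic written once against the algebra, not a concrete effect.
    def doubled[F[_]](raw: String)(implicit M: Monad[F], S: EventSource[F]): F[Int] =
      M.flatMap(S.parse(raw))(n => M.pure(n * 2))
  }
}
```

With the `Option` interpreter, `TaglessSketch.Pipeline.doubled[Option]("21")` evaluates to `Some(42)`, and an unparsable input gives `None`. Swapping in another effect type would only require new `Monad` and `EventSource` instances; `doubled` itself stays unchanged.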

3 comments

I'm assuming the parent commenter hasn't worked in data/Spark before either. The functional rabbit hole goes WAY deeper than even just cats et al, and Scala and Spark themselves both encourage a fair amount of functional-style code on their own.
Could you speak to how you're interfacing Scala with Flink? I looked into using Scala with Flink a while back, and stopped when I found out that the Scala API was deprecated.
scala? why not haskell instead?
Spark is written in Scala and Scala is its first-class language - other languages suffer from either second-class APIs (Java) or codec/serde overhead (PySpark), and PySpark is also missing a few APIs that Scala has.
Not assuming you’re serious, but in any case: the reason is the JVM (+ Scala) ecosystem in the data space.
FWIW, I do believe there is a serious case to be made for Haskell… But it’s probably beyond the scope of this context / would require changing many other decisions.

If integrating with Java tools was important then personally I’d ask “why not Clojure”.

:)