by walshemj
3392 days ago
Can I ask why functional programming in particular? I can see why you might want to avoid Java for big data, but isn't the average ML algorithm more in the procedural mould? Would Python with NumPy not be a better fit? Or Fortran with some hand-wavy interface code? Back in the early '80s, when I did map-reduce, we used PL1/G.

From a business standpoint though, there are a few main reasons:
–Data pipelines are well modeled as functions: they take a few input datasets, return a few outputs at the end, and do a ton of processing in between
–FP idioms generally make parallelization easier, and this is very important for the datasets we're dealing with
–A strong type system like Scala's lets us catch many errors at compile time instead of at runtime, which is quite important when a pipeline can run for several hours before it fails
–It's fairly trivial to wrap a statistical/ML algorithm in a pure functional interface, even if the algorithm itself is imperative
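
To make the last point concrete, here is a rough sketch in Python (since the question brought it up) rather than Scala. The names `running_mean` and `pipeline` are made up for illustration and are not from our codebase, and the thread pool is only a stand-in for whatever executor or cluster would really do the work. The algorithm is a plain imperative loop, but because it never modifies its input and always returns the same output for the same input, partitions can be mapped over independently:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Sequence, Tuple


def running_mean(values: Sequence[float]) -> Tuple[float, ...]:
    """Pure interface over an imperative loop (hypothetical example)."""
    total = 0.0  # mutable state stays local and is invisible to callers
    out = []
    for i, v in enumerate(values, start=1):
        total += v
        out.append(total / i)
    return tuple(out)  # return an immutable result; the input is untouched


def pipeline(partitions: Sequence[Sequence[float]]) -> Tuple[Tuple[float, ...], ...]:
    """A pipeline as a function: datasets in, datasets out."""
    # running_mean has no side effects, so each partition can be processed
    # independently; pool.map still returns results in input order.
    with ThreadPoolExecutor() as pool:
        return tuple(pool.map(running_mean, partitions))


partitions = ((1.0, 2.0, 3.0), (10.0, 20.0))
result = pipeline(partitions)
```

Swapping the thread pool for a process pool or a distributed runner would not change `running_mean` at all, which is roughly the benefit being claimed. What Python does not give you here is the compile-time checking mentioned above; the type hints are only enforced if you run a separate checker.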