by peatmoss 1167 days ago
I'd agree with all of this. It's not so much the presence of an ecosystem, it's the maturity of the ecosystem and knowing how to navigate it. Lots of competing approaches in a relatively small community make me a little nervous.

If people are curious, the DataFrames.jl maintainers have done a fair bit of work consolidating a list of other packages that complement DataFrames.jl:

https://dataframes.juliadata.org/latest/#DataFrames.jl-and-t...

I do have to say that DataFrames.jl itself is a pretty impressive bit of work, and Bogumił Kamiński deserves no small amount of credit. If Julia ends up stealing significant mindshare from the R / Python data science communities, it'll be because of this work.

Also, for the Julia-curious, I'll say that I find the ergonomics and sensibilities of the DataFrames.jl ecosystem much more in line with R and the Tidyverse than with Python/Pandas.

1 comment

Bogumił is a truly outstanding member of the community and DataFrames.jl is an impressive, versatile package.

From my perspective, however, DataFrames.jl's power is what makes it quite unergonomic for me. As an example, take the `args => transformations => result` syntax used for pretty much everything in DataFrames.jl. It is versatile, but Julia lacks rank polymorphism, i.e. broadcasting/mapping has to be explicit (which is usually a good thing, given that type polymorphism is Julia's whole shtick), and that makes the transformation syntax feel cumbersome.
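
A small sketch of what that looks like in practice, assuming a recent DataFrames.jl; the toy data frame and the column names `a`, `b`, `c` and `a_total` are made up purely for illustration. The function in the middle of the `=>` chain is handed whole columns, so elementwise work has to be spelled out, either with dots or with a `ByRow` wrapper:

```julia
using DataFrames

# Hypothetical toy data, only here to show the syntax.
df = DataFrame(a = [1, 2, 3], b = [10, 20, 30])

# The function receives the columns as vectors, so elementwise
# arithmetic needs explicit broadcasting dots ...
by_dots = transform(df, [:a, :b] => ((a, b) -> a .* b .+ 1) => :c)

# ... or an explicit ByRow wrapper around a scalar function.
by_row = transform(df, [:a, :b] => ByRow((a, b) -> a * b + 1) => :c)

# A reduction returns a scalar, which transform repeats down the new column.
with_total = transform(df, :a => sum => :a_total)
```

For comparison, the rough dplyr equivalent of the first two calls would be `mutate(df, c = a * b + 1)`, with no dots or wrapper, because arithmetic in R is already vectorised.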

It's not that I want everything rowwise by default, which is the option DataFrameMacros.jl provides; it's that I want things to be rank polymorphic when it makes sense. Base R got this right (hell, S got this right), so the Tidyverse inherited it, and that makes those packages so much more ergonomic than they would otherwise be.

I cannot overstate how impressive DataFrames.jl is, but I have to caveat this with "but I really try to avoid using it if possible". It's a shame, but I just think R's laissez-faire hackability, which in many cases results in spaghetti code, works really well in the tabular programming world where ergonomics are king and performance is easy.