by martinsmit
1162 days ago
I think DF.jl works remarkably well as a "fits in RAM" dataframe backend, but it falls short on usability and integration with the wider Julia ecosystem. Or rather, the wider ecosystem isn't as mature in key data analysis areas. In particular, as you mention, plotting is one of the still-evolving parts of the ecosystem. Plots.jl is fine, Makie is powerful but very DIY, and AoG is slick but unwieldy. ggplot2 is far from perfect, but it works so well because of its maturity and its integration with the rest of the Tidyverse. In my ideal world, there would be a DataFrames.jl wrapper providing nice syntax (not just nicer, like the two DFM.jl packages) and a powerful high-level plotting package (Makie is powerful but its syntax is low level, Plots is mid on both) that is heavily integrated with the wrapper package. Admittedly, I'm not a data scientist (anymore), so I don't follow new developments in the dataviz scene much. If something like this exists, I would love to find it. I wonder what my ideal syntax would look like anyway. Maybe something close to Tidyverse, but with symbols as column names (`:col_name`) to avoid ambiguity.
If people are curious, DataFrames.jl has done a fair bit of work consolidating a list of other packages that complement it:
https://dataframes.juliadata.org/latest/#DataFrames.jl-and-t...
I do have to say that DataFrames.jl itself is a pretty impressive bit of work, and Bogumił Kamiński deserves no small amount of credit. If Julia ends up stealing significant mindshare from the R and Python data science communities, it'll be because of this work.
Also, for the Julia-curious, I'll say that I find the ergonomics and sensibilities of the DataFrames.jl ecosystem much more in line with R and the Tidyverse than with Python/Pandas.