Hacker News
by wildanimal 5404 days ago
I entirely agree, though my personal preference is for lattice over ggplot, as it is currently faster and more flexible (the "limitations" of ggplot reflect its creator's preferences, e.g. not being able to have two axes on the same plot). That said, reshape and plyr are quite useful.

R does have surprising capability for shell scripting and text processing, albeit slower than Python. I also use Python for any rapid text processing needed (possibly populating an SQL database) to prepare data for R to eventually use.
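As a rough illustration of that workflow, here is a minimal sketch that parses some text with Python and loads it into an SQLite database using only the standard library. The input format, the `measurements` table, and the `load_lines` helper are all made up for the example; a real pipeline would differ.

```python
import sqlite3


def load_lines(lines, db_path=":memory:"):
    """Parse 'name, value' text lines and store them in an SQLite table.

    The line format and the table layout are hypothetical, chosen only
    to illustrate the text-to-database step.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements (name TEXT, value REAL)"
    )
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        name, raw_value = line.split(",", 1)
        rows.append((name.strip(), float(raw_value)))
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    conn.commit()
    return conn


raw_text = """# sample input
alpha, 1.5
beta, 2.0

alpha, 3.5
"""

conn = load_lines(raw_text.splitlines())
totals = conn.execute(
    "SELECT name, SUM(value) FROM measurements GROUP BY name ORDER BY name"
).fetchall()
print(totals)  # [('alpha', 5.0), ('beta', 2.0)]
```

Passing a file path instead of `":memory:"` would leave a database file on disk, which R could then query, for example through a package such as RSQLite.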

I've been keeping an eye on SciPy, but there still seems to be a lot of "the source code is the documentation", whereas in R the documentation is usually superb and well-structured. And Matplotlib, while beautiful, seems to be more verbose than Matlab or R when it comes to customizing the details of a graphic (e.g., the axes). That's just my impression, but I wouldn't mind being shown otherwise.


I've built the equivalent of most of the plyr and reshape/reshape2 packages inside pandas (http://pandas.sourceforge.net; note that I am in the midst of overhauling the documentation for the upcoming release). I plan to write a decent number of side-by-side code comparisons, which should be useful for folks with R experience who want to use Python for data analysis and statistics. Feedback on pandas from R users savvier than myself would also be extremely helpful.
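To give a flavor of what such a comparison might look like, here is a small sketch of the pandas side. It assumes the present-day pandas API (`groupby`, `melt`, `pivot_table`), which may not match the release described here, and the column names and data are invented for illustration. The R functions named in the comments are only rough analogues.

```python
import pandas as pd

# Toy data; column names are hypothetical.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [10.0, 20.0, 30.0, 40.0],
})

# Split-apply-combine, roughly what plyr's ddply does:
# the mean of x and y within each group.
means = df.groupby("group")[["x", "y"]].mean()

# Wide to long, roughly what reshape's melt does.
long = df.melt(id_vars="group", var_name="variable", value_name="value")

# Long back to wide with an aggregation, roughly what reshape's cast does.
wide = long.pivot_table(
    index="group", columns="variable", values="value", aggfunc="sum"
)

print(means)
print(long)
print(wide)
```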

Building a plotting library with the ease, sophistication, and beauty of ggplot2 in Python would be a big deal. A number of people I know are interested in venturing down that path (ggpy, anyone?).

That sounds great. I only recently learned of pandas (I normally don't use much of NumPy/SciPy), but I will keep an eye on it.
Thanks for your work on pandas. I'm looking forward to being able to stick with one language for most of my data analysis tool chain.
Yes, a Python version of ggplot2 is well overdue!