by wesm
2149 days ago
Hi, Wes (Apache Arrow co-creator and Python pandas creator) here! If you're wondering what this project is all about, my JupyterCon keynote (18 min long) from 3 years ago is a good summary, and the vision and scope of what we've been doing since 2016 has been pretty consistent: https://www.youtube.com/watch?v=wdmf1msbtVs
|
I'm a big fan of pandas, and didn't know about Arrow. I've been considering doing a talk advocating for a consistent dataframe API across languages since, IMHO, it's the next fundamental data structure that should have baked-in support everywhere. So it appears you've at least somewhat beaten me to the punch.
Since Arrow is more than an API to tabular data structures, what would you think about a Promises/A+-like specification for dataframes?
How much of the Arrow API do you think end users will wind up using directly, as opposed to Arrow being a lower-level framework that projects like pandas and dplyr use behind the scenes?
Finally, do you think that Arrow has the potential to be the logical successor to pandas? If not, what is your long-term strategy to address the shortcomings that you see in pandas?