by throwaway894345
1941 days ago
In my opinion it's this sort of short-sighted thinking that has cursed the Python project. "Everyone uses CPython" leads to "let's just let third party packages depend on any part of CPython" which leads to "Can't optimize CPython because it might break a dependency" which leads to "CPython is too slow, the ecosystem needs to invest heavily in c-extensions [including numpy]" which leads to "Can't create alternate Python implementations because the ecosystem depends concretely on CPython"[^1] and probably also the mess that is Python package management.

I'm not sure that the Numpy/Pandas hegemony over Python scientific computing will last. Eventually the ecosystem might move toward Arrow or something else. In this case it's probably not such a big deal because Arrow's mainstream debut will probably predate any serious adoption of Cython, but if it didn't then the latter would effectively preclude the former--Arrow becomes infeasible because everyone is using Cython/Numpy and Cython/Arrow performance is too poor to make the move, and since no one is making the move it's not worth investing in an Arrow special case in Cython and now no one gets the benefits that Arrow confers over Numpy/Pandas.

[^1]: Yes, Pypy exists and its maintainers have done yeoman's work in striving for compatibility with the ecosystem, and still (last I checked) you couldn't do such exotic things as "talking to a Postgres database via a production-ready (read: 'maintained, performant, secure, tested, stable, etc') package".
|
Arrow is a low-level implementation detail, like BLAS. "Using" Arrow in data science in Python would mean implementing an Arrow-backed Pandas (or Pandas-like) DataFrame.
Your rank-and-file data scientist doesn't even know that Arrow exists, let alone that you can theoretically implement arrays, matrices, and data frames backed by it.
If you want to break the hegemony of Numpy, you will have to reimplement Numpy using CFFI instead of the CPython C API. There is no other way, unless you get everyone to switch to Julia.
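For a rough sense of what "using CFFI instead of the CPython C API" looks like, here is a minimal sketch, not a real NumPy replacement. It assumes the third-party `cffi` package is installed and a POSIX system (where `ffi.dlopen(None)` exposes the C library). The `DoubleArray` class and its methods are hypothetical names made up for illustration. The storage is a plain C `double[]` and the copy goes through libc's `memcpy`, so nothing here touches `PyObject` internals.

```python
from cffi import FFI

ffi = FFI()
# Declare the single C function we call; this is the standard C signature.
ffi.cdef("void *memcpy(void *dest, const void *src, size_t n);")
# Assumption: POSIX. dlopen(None) opens the process's own C library namespace.
libc = ffi.dlopen(None)


class DoubleArray:
    """Hypothetical toy array of C doubles whose memory is owned by CFFI."""

    def __init__(self, values):
        values = [float(v) for v in values]
        self._n = len(values)
        # Allocate a C array of doubles and initialise it from the list.
        self._data = ffi.new("double[]", values)

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        return self._data[i]

    def copy(self):
        # Allocate a zeroed array, then copy the raw bytes with libc memcpy.
        out = DoubleArray([0.0] * self._n)
        libc.memcpy(out._data, self._data, self._n * ffi.sizeof("double"))
        return out

    def tolist(self):
        return [self._data[i] for i in range(self._n)]


a = DoubleArray([1, 2, 3])
b = a.copy()
print(b.tolist())
```

Because the only contact with the interpreter is through CFFI's own objects, the same code should in principle run on any implementation that ships CFFI, which is the portability argument being made above.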