by datanecdote
2075 days ago
> 2. I don't entirely follow this point. Perhaps using PyArrow's parser would be faster than what is timed here, but is that what the typical Python data science user would do?

I am a Python data science user. If the data gets big enough that loading time becomes a bottleneck, I use Parquet files instead of CSV and load them into pandas with PyArrow. It's a one-line change. The creator of pandas is now leading the Arrow project, and the two work together seamlessly. I don't know if I'm typical, but that's what I do.
|
Jacob Quinn (karbacca) also has a package, Arrow.jl, for integrating Julia into the Arrow ecosystem: https://github.com/JuliaData/Arrow.jl