by ritchie46
1260 days ago
Hi Ian ;), it depends on what you let determine the order. On hiring experience and available content, I wholeheartedly agree with your list. But if we order by performance and memory efficiency, a single-threaded, eager library is simply no comparison and should not top that list. In every TPC-H query we ran, polars is orders of magnitude faster than pandas. https://www.pola.rs/benchmarks.html Interoperability with legacy systems should not be a concern. Polars is backed by Arrow memory, and Arrow is becoming the default data transformation layer. Other than that, you can easily convert to pandas or numpy. The cost of that single copy is often negligible compared with the time lost in a pandas join. Polars and pandas can work hand in hand; you don't have to fully replace one. It is 2023, polars is used in production and is here to stay. IMO it should seriously be considered if performance and consistency are important to you.