by lmeyerov
1337 days ago
agreed, reading this article was confusing: the python baseline is far from our reality

for reference, we're aiming for 1-100 GB/s per server in our python etl+ml+viz pipelines

interestingly, duckdb+polars are nice for small non-etl/ml perf, but once it's analytical processing, we use cudf / dask_cudf for much more perf per watt / $. I'd love the low overhead & typing benefits of polars, but as soon as you start looking at GB+/s and occasional bigger-than-memory workloads, the core sw+hw needs to change a bit, end-to-end

(and if folks are into graph-based investigations, we're hiring backend/infra :) )