by gwenzek
2279 days ago
Interesting approach.
I'm currently not satisfied with pandas, which seems to be the de facto tool for processing tables.
I find its query API really unnatural, especially for filtering. Do you have any performance benchmarks? Is this aimed more at playing around in a notebook, or at use inside a full data processing pipeline?
However, this was not the reason I needed to build convtools. I needed to process reports while touching only some columns (without failing if an unrelated column is no longer processable), so I needed to reuse and combine Python expressions across multiple procedures.
There are no benchmarks at the moment; you can pass debug=True to the gen_converter method to see the generated code and judge whether it's optimal for your use case. This is a Python library which generates simple Python code:
- without unnecessary conditions and loops
- without keeping all items of an iterable in memory to aggregate (it leverages reducers)
- making no use of C extensions.
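To give a rough idea of the style of code described above, here is a hand-written sketch of a streaming group-by/sum in plain Python. It is not convtools output and does not use the convtools API; the function name, the column names ("region", "amount", "note") and the sample rows are all made up for illustration.

```python
from typing import Dict, Iterable, Iterator


def sum_amount_by_region(rows: Iterable[dict]) -> Dict[str, float]:
    """Hypothetical reducer-style aggregation: one pass, one running total per key."""
    totals: Dict[str, float] = {}
    for row in rows:  # rows are consumed one at a time, never collected into a list
        key = row["region"]
        totals[key] = totals.get(key, 0) + row["amount"]
    return totals


def read_rows() -> Iterator[dict]:
    # Made-up report rows; the unrelated "note" column is never read,
    # so odd values in it cannot break the aggregation.
    yield {"region": "eu", "amount": 10, "note": object()}
    yield {"region": "us", "amount": 5, "note": None}
    yield {"region": "eu", "amount": 7, "note": "n/a"}


result = sum_amount_by_region(read_rows())
print(result)
```

Only the running totals are held in memory, which is the point of the reducer-based approach; the real generated code can be inspected with debug=True as mentioned above.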