by meitham
1124 days ago
I have a large number of small, frequent batches (think of it as discrete ETL), where each process operates on a pandas DataFrame. Each frame is written to disk as Parquet, and a DuckDB file is created immediately afterwards by importing that Parquet file. From then on the DuckDB file is only ever opened for reads, with no further writes. I use a Python OData library to convert users' REST queries into Postgres-like SQL and run it against these DuckDB files to apply filters where needed.