by Fiahil
1691 days ago
Thanks for the clarification! :) Not advice as such, but you should probably consider spinning off a secondary product from DuckDB with a sole focus on "reading data from Parquet files and running aggregations as efficiently as possible". You could probably skip INSERT, UPDATE, and DELETE completely. There is currently a gap in practical solutions for this pain point. You can use Spark or Airflow, but there is nothing that comes without a big infra price tag (you can do it with pandas, but you need a large instance to load the entire dataset into memory). I think the right product could even outpace what you currently have with DuckDB.