by ul5255
40 days ago
I have several years' worth of timestamped sensor data in a SQLite DB that is 12+ GB and growing. One idea was to experiment with DuckDB. Your comment about DuckDB has me worried, as time-range queries are common. Any link for me to dig deeper?
|
You could certainly create a directory with a Parquet file for each (entity id, time range), and you could probably convince the DuckDB query engine to understand that layout (using DuckLake? Raw Hive partitioning can only barely do this), but I don’t think DuckDB will binary search for you. (And binary search is actually pretty lousy for this use case.)
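To make the per-(entity id, time range) layout concrete, here is a rough sketch in plain Python of how a reader could pick only the files that overlap a time-range query. The directory naming (`data/entity_id=.../month=.../part.parquet`), the one-file-per-month granularity, and the `Partition`, `month_partitions`, and `prune` names are all assumptions made up for illustration; this is not how DuckDB, DuckLake, or Hive implement it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Partition:
    path: str         # hypothetical file path, never opened here
    entity_id: str
    start: datetime   # inclusive
    end: datetime     # exclusive


def month_partitions(entity_id: str, year: int) -> list[Partition]:
    """Describe one hypothetical Parquet file per calendar month of `year`."""
    parts = []
    for month in range(1, 13):
        start = datetime(year, month, 1, tzinfo=timezone.utc)
        # First day of the following month (rolls over to next year after December).
        end = datetime(year + (month == 12), month % 12 + 1, 1, tzinfo=timezone.utc)
        path = f"data/entity_id={entity_id}/month={year}-{month:02d}/part.parquet"
        parts.append(Partition(path, entity_id, start, end))
    return parts


def prune(partitions: list[Partition], entity_id: str,
          lo: datetime, hi: datetime) -> list[str]:
    """Keep files for `entity_id` whose [start, end) overlaps the query range [lo, hi)."""
    return [
        p.path
        for p in partitions
        if p.entity_id == entity_id and p.start < hi and lo < p.end
    ]


partitions = month_partitions("sensor-a", 2024) + month_partitions("sensor-b", 2024)
selected = prune(
    partitions,
    "sensor-a",
    datetime(2024, 3, 15, tzinfo=timezone.utc),
    datetime(2024, 5, 2, tzinfo=timezone.utc),
)
for path in selected:
    print(path)
```

This only narrows down which files get read. It says nothing about how rows are located inside each file, which is the part the binary-search remark is about.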
ClickHouse has explicitly ordered tables:
https://clickhouse.com/docs/engines/table-engines/mergetree-...