by hodgesrm
2354 days ago
The important difference is that we used a more realistic temperature profile, which as you say does affect compression for that column. Schema design (including sort order, compression, and codecs) for the remaining columns is just good ClickHouse practice. Much of the storage and I/O savings comes from the date, time, and sensor_id columns. It's also useful to note that the materialized view results would be essentially the same no matter how you generate and store data, because the materialized view down-samples temperature max/min to daily aggregates. The aggregated data are vastly smaller than the raw readings no matter how you generate them. The article illustrates that if you really had such an IoT app and designed it properly, you could run analytics with surprisingly few resources. I think that's a significant point.
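To make the down-sampling point concrete, here is a rough sketch in plain Python, not the article's ClickHouse schema or materialized view. The function names, sensor counts, and temperature generators are all hypothetical and chosen only for illustration. It shows that the number of daily min/max rows depends on sensors and days, not on how the temperatures were generated.

```python
import random
from datetime import datetime, timedelta


def daily_min_max(readings):
    """Collapse (sensor_id, timestamp, temperature) rows to one
    (min, max) pair per sensor per calendar day."""
    agg = {}
    for sensor_id, ts, temp in readings:
        key = (sensor_id, ts.date())
        lo, hi = agg.get(key, (temp, temp))
        agg[key] = (min(lo, temp), max(hi, temp))
    return agg


def make_readings(temp_fn, sensors=3, days=2, per_day=1440):
    """Generate one reading per minute per sensor (hypothetical workload)."""
    start = datetime(2019, 1, 1)
    step = timedelta(days=1) / per_day
    return [
        (s, start + i * step, temp_fn(i))
        for s in range(sensors)
        for i in range(days * per_day)
    ]


# Two very different temperature profiles: constant vs. noisy.
rng = random.Random(0)
constant = make_readings(lambda i: 1.0)
noisy = make_readings(lambda i: 20.0 + rng.uniform(-5, 5))

raw_rows = len(noisy)
daily_constant = daily_min_max(constant)
daily_noisy = daily_min_max(noisy)

print("raw rows:", raw_rows)
print("daily rows (constant profile):", len(daily_constant))
print("daily rows (noisy profile):", len(daily_noisy))
```

Both profiles reduce to the same number of daily rows (sensors times days), which is why the aggregate's size is insensitive to the temperature generator even though raw-column compression is not.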
|
I get the desire to critique the temperature profile, but completely changing it makes the comparison worthless. From a data perspective it's like saying "if all the sensors just report 1 for temperature every reading, computing the min, max, and average is super fast". No shit, that wasn't the task though.