by jrumbut 1238 days ago
ClickHouse is one of those technologies I've had half an eye on for a while.

In your completely unbiased opinion (I kid), do you think it's a good choice for the following problem?

I have multiple sensors that read different types of data about the same subject at (annoyingly) slightly different intervals, usually a few dozen times a second. This needs to be combined with other event data that happens on the order of a few times per day.

Currently I analyze this data in Python, R, and sometimes SAS (a weird proprietary language). Some coworkers use Matlab.

Is that a ClickHouse problem? If I tried it out would the ClickHouse community be interested in hearing how it goes?

1 comment

Looks like a good scenario for ClickHouse.

One option is to just record all the measurements with the corresponding time. Something like a table with:

  sensor_id, time, value
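
As a rough sketch, such a table could be declared like this. The column types, the millisecond precision, and the sorting key are my assumptions, not something fixed by the description above, so adjust them to your data:

  -- Hypothetical schema: types and ORDER BY are assumptions for illustration.
  CREATE TABLE measurements
  (
      sensor_id LowCardinality(String),  -- few distinct sensors, many rows
      time      DateTime64(3),           -- millisecond precision for sub-second readings
      value     Float64
  )
  ENGINE = MergeTree
  ORDER BY (sensor_id, time);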

To align and correlate the measurements, simply round down the time to some bucket. Do something like:

  SELECT toStartOfMinute(time) AS t, anyIf(value, sensor_id = 'X'), anyIf(value, sensor_id = 'Y')
  FROM measurements
  WHERE sensor_id IN ('X', 'Y')
  GROUP BY t
  ORDER BY t
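
If the sensors report dozens of times per second, a finer bucket with an average per bucket may be more useful than picking an arbitrary value per minute. A variant of the query above, assuming one-second buckets and that averaging makes sense for your values (the x_avg and y_avg aliases are just illustrative names):

  -- Variant: one-second buckets, averaging each sensor's readings per bucket.
  SELECT
      toStartOfInterval(time, INTERVAL 1 SECOND) AS t,
      avgIf(value, sensor_id = 'X') AS x_avg,
      avgIf(value, sensor_id = 'Y') AS y_avg
  FROM measurements
  WHERE sensor_id IN ('X', 'Y')
  GROUP BY t
  ORDER BY t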

Yes, I'm interested in how it goes! milovidov@clickhouse.com

Another interesting option for correlating measurements at uneven intervals is ASOF JOIN.
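
A minimal sketch of what that could look like, pairing each reading of sensor X with the latest reading of sensor Y at or before it. ASOF JOIN needs at least one equality condition besides the inequality, so this assumes a hypothetical subject_id column that is not in the table described above:

  -- Assumes a hypothetical subject_id column on measurements.
  SELECT
      x.time  AS x_time,
      x.value AS x_value,
      y.time  AS y_time,
      y.value AS y_value
  FROM
      (SELECT subject_id, time, value FROM measurements WHERE sensor_id = 'X') AS x
  ASOF LEFT JOIN
      (SELECT subject_id, time, value FROM measurements WHERE sensor_id = 'Y') AS y
      ON x.subject_id = y.subject_id AND x.time >= y.time
  ORDER BY x_time

The same pattern should work for attaching the most recent event from a separate events table to each measurement.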