| "time-series database" is some of the most overhyped nonsense since NoSQL. Time-series is just data with time as a primary component. It comes in all shapes and volumes, but if you have a lot of data and are running heavy OLAP queries, then we already have an entire class of capable databases. Use any modern distributed relational column-oriented database, set primary key to metric id + timestamp, and you'll be able to scale easily with full SQL and joins. You can keep your other business data there too, along with JSON, geospatial, window functions, and all the other rich analytical queries available with relational databases. We have trillion row tables that work great. No special "TSDB" needed. |
As an example, in the industry I work in, you may have no readings for days or weeks, and then hundreds of readings from the same sensor. Why? Many systems in industrial environments send new readings only "on-change", and assume the underlying data storage architecture will forward-fill the in-between times. This is why the practically ancient time series architecture of data historians still dominates in these use cases.
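To make the forward-fill point concrete, here is a minimal sketch in plain Python. The sensor data, the `value_at` and `forward_fill` names, and the second-based timestamps are all made up for illustration; this is not how any particular historian implements it, just the semantics the storage layer is expected to provide.

```python
from bisect import bisect_right

# Hypothetical on-change readings from one sensor: (timestamp_seconds, value),
# sorted by time. A gap means "value unchanged", not "data missing".
readings = [(0, 20.0), (3600, 20.5), (3605, 21.0), (90000, 19.0)]


def value_at(readings, t):
    """Return the last reported value at or before time t.

    Returns None if t is earlier than the first reading.
    """
    times = [ts for ts, _ in readings]
    i = bisect_right(times, t)  # number of readings with timestamp <= t
    return readings[i - 1][1] if i else None


def forward_fill(readings, start, stop, step):
    """Resample onto a regular grid, carrying the last value forward."""
    return [(t, value_at(readings, t)) for t in range(start, stop, step)]


# Hourly grid over the first two hours; 7200 falls in a long silent gap
# and still resolves to the last value reported at 3605.
hourly = forward_fill(readings, 0, 7201, 3600)
```

Note that a query for any instant inside the long gap (say `value_at(readings, 50000)`) has to reach back to the reading at 3605, which is exactly what a plain "rows between these two timestamps" query will not do.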
In fact, for many time series functions you actually have to throw away the notion of relational joins to be able to efficiently perform time-preserving joins. Window functions only work well in basic aggregation use cases over relatively small amounts of data.
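As a rough illustration of what I mean by a time-preserving join, here is a sketch of an "as-of" join, assuming that is a fair reading of the term: each row on the left gets the most recent row on the right at or before its timestamp, rather than an equality match on a key. The `asof_join` function and the pressure/valve data are hypothetical, and both inputs are assumed to be sorted by time.

```python
def asof_join(left, right):
    """Attach to each (t, x) in left the latest right value with timestamp <= t.

    Both inputs are lists of (timestamp, value) sorted by timestamp.
    Runs in a single merge pass: O(len(left) + len(right)).
    """
    out = []
    j = 0
    last = None  # most recent right-hand value seen so far
    for t, x in left:
        # Advance the right cursor up to and including time t.
        while j < len(right) and right[j][0] <= t:
            last = right[j][1]
            j += 1
        out.append((t, x, last))
    return out


# Hypothetical streams: regular pressure samples and on-change valve states.
pressure = [(10, 1.01), (20, 1.02), (30, 1.00)]
valve = [(5, "open"), (25, "closed")]

joined = asof_join(pressure, valve)
```

The timestamps on the two sides never match exactly, so an ordinary equi-join returns nothing, and the efficiency comes from exploiting the time ordering in one pass instead of matching keys.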