by sciurus 2826 days ago
Congrats on the RC! TimescaleDB is a very neat idea; there's a lot to gain by building on Postgres.

There's also a very serious limitation due to that: the requirement to predefine schemas. My primary use case for a timeseries-focused db is storing system and application metrics. With a commercial (e.g. Datadog, SignalFx) or open source (e.g. InfluxDB, Prometheus) timeseries product, I can submit arbitrary data. If I had to perform a schema migration every time a developer wanted to record a new metric, it would be extremely painful.

If this has changed since I last looked at TimescaleDB, please correct me!

3 comments

This is not the case.

TimescaleDB has full support for storing JSON data (inherited from Postgres), including indexes on that data, so you do not need to fully predefine your schema for these types of applications.
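
A rough sketch of what that could look like, not an official schema: the table, column, and index names below are made up for illustration, the `%s` placeholders assume a psycopg2-style driver, and the Python only builds the statements and parameters rather than connecting to a database. `create_hypertable`, `JSONB`, and GIN indexes are the real TimescaleDB/Postgres features being relied on.

```python
import json
from datetime import datetime, timezone

# Hypothetical DDL: names are illustrative, not taken from TimescaleDB docs.
DDL = """
CREATE TABLE metrics (
    time   TIMESTAMPTZ NOT NULL,
    name   TEXT        NOT NULL,
    value  DOUBLE PRECISION,
    labels JSONB
);
SELECT create_hypertable('metrics', 'time');
CREATE INDEX metrics_labels_idx ON metrics USING GIN (labels);
"""

# Assumes a driver that uses %s placeholders (e.g. psycopg2).
INSERT = "INSERT INTO metrics (time, name, value, labels) VALUES (%s, %s, %s, %s)"


def metric_row(ts, name, value, **labels):
    """Build the parameter tuple for INSERT; all labels go into one JSON document."""
    return (ts.isoformat(), name, float(value), json.dumps(labels, sort_keys=True))


ts = datetime(2018, 1, 1, tzinfo=timezone.utc)
# Two different metrics with different label sets, same table, no migration.
row_a = metric_row(ts, "cpu_usage", 0.42, host="web-1")
row_b = metric_row(ts, "queue_depth", 17, host="web-1", queue="emails")
```

A developer recording `queue_depth` for the first time only adds a new `name` and a new key inside `labels`; the table definition is untouched.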

In fact, we added support for TimescaleDB to be a read/write backend for long-term Prometheus metrics. You pull from Prometheus via its remote storage backend, and the data appears automatically in TimescaleDB. But then unlike Influx and native Prometheus, you get to JOIN it against additional metadata for richer questions. For more information:

https://blog.timescale.com/sql-nosql-data-storage-for-promet...

https://github.com/timescale/pg_prometheus

Cool! It looks like the metrics view you build on top of the values and labels tables makes it _reasonably_ easy to query. I still worry about how you get good autocomplete in Grafana, though.

Support for receiving data from Prometheus is nice, but for people who aren't already invested in that, it would be helpful if you either

a) picked an agent (e.g. Telegraf or Collectd) and taught it how to submit to TimescaleDB directly

b) picked a protocol already commonly used by agents (e.g. graphite plaintext protocol) and taught TimescaleDB to receive it
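
To make option (b) concrete: the Graphite plaintext protocol is one line per sample, `<metric path> <value> <unix timestamp>`. A minimal sketch of the parsing half of such a receiver might look like this; `parse_graphite_line` is a hypothetical helper, not something TimescaleDB ships, and the tuple it returns is just one possible mapping onto a time/name/value row.

```python
from datetime import datetime, timezone


def parse_graphite_line(line):
    """Parse '<metric path> <value> <unix timestamp>' into (time, metric_name, value)."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    path, raw_value, raw_ts = parts
    # Timestamps are seconds since the Unix epoch; tolerate a fractional part.
    ts = datetime.fromtimestamp(int(float(raw_ts)), tz=timezone.utc)
    return (ts, path, float(raw_value))


row = parse_graphite_line("servers.web1.cpu.load 0.75 1514764800\n")
```

Each parsed tuple could then be inserted as a row, with the metric path stored as data rather than as a column name.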

In addition to JSON, we also support EAV-type schemas, which look like:

- time | metric_name | value

- or time | metric_id | value

- or time | tags_id | value where the tags table contains normalized tags and metric names
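
A small sketch of the third layout, with loud caveats: it uses Python's built-in SQLite as a stand-in so it runs anywhere, not Postgres or TimescaleDB, and the `tags`/`samples` table names, the `host` tag, and the helper functions are all invented for illustration.

```python
import sqlite3

# SQLite in memory is only a stand-in here; the layout is what matters.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    id          INTEGER PRIMARY KEY,
    metric_name TEXT NOT NULL,
    host        TEXT NOT NULL,
    UNIQUE (metric_name, host)
);
CREATE TABLE samples (
    time    INTEGER NOT NULL,              -- unix seconds
    tags_id INTEGER NOT NULL REFERENCES tags(id),
    value   REAL    NOT NULL
);
""")


def tags_id(metric_name, host):
    """Return the id for this (metric_name, host) pair, creating it if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO tags (metric_name, host) VALUES (?, ?)",
        (metric_name, host),
    )
    return conn.execute(
        "SELECT id FROM tags WHERE metric_name = ? AND host = ?",
        (metric_name, host),
    ).fetchone()[0]


def record(time, metric_name, host, value):
    """Store one sample in the narrow time | tags_id | value table."""
    conn.execute(
        "INSERT INTO samples VALUES (?, ?, ?)",
        (time, tags_id(metric_name, host), value),
    )


record(1514764800, "cpu_usage", "web-1", 0.42)
record(1514764860, "cpu_usage", "web-1", 0.40)
record(1514764860, "queue_depth", "web-1", 17)  # new metric, no ALTER TABLE

# Join the narrow samples table back to its normalized tags.
rows = conn.execute("""
    SELECT s.time, t.metric_name, s.value
    FROM samples AS s JOIN tags AS t ON t.id = s.tags_id
    WHERE t.metric_name = 'cpu_usage'
    ORDER BY s.time
""").fetchall()
```

Recording a new metric is just a new row in `tags`, so no schema migration is involved.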

This is my understanding too. It's also one of the reasons we ruled it out when we were evaluating other in-house TSDB options.