
I used 1.x for my push-monitoring stack at my last job. (For cases where "pull" is practical, I would always use Prometheus. Prometheus also has "push" now, by the way.) They went into 2.0 mode and kind of neglected 1.x, and I kind of forgot about it. At the time, I was most familiar with an internal monitoring system at Google, and I found I couldn't do queries that I expected to be able to do. I even mentioned it on HN and some influx folks told me that what I wanted to do was too weird to support. (It's not. I was collecting byte counters from fiber CPEs, and wanted to have bandwidth charts based on topology tags I stored with the data -- imagine a SQL table like (serial_number text not null, time timestamp not null, locality text not null, bytes_sent int64 not null, bytes_received int64 not null). The problem was that timestamps would not be aligned between records in the same locality group -- I sampled these occasionally throughout the day and not all at the same instant. And, they were counters, not deltas, so the query would have to do the delta across each serial number, and then aggregate across all devices in a locality. Very possible to do, I literally had that chart with the other monitoring system. But not possible with the influx v1 querying, as far as I could tell.)
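
To make the query I wanted concrete, here is a minimal Python sketch of the delta-then-aggregate steps, done outside any database. It is an illustration only, not how Influx or the Google system does it. The function name `locality_rates`, the tuple layout, and the one-hour window are all assumptions, and it handles a single counter (say, bytes_sent) at a time.

```python
from collections import defaultdict
from statistics import mean


def locality_rates(samples, window_s=3600):
    """Turn unaligned per-device counter readings into per-locality rates.

    samples: iterable of (serial_number, unix_time_s, locality, counter).
    Returns {locality: {window_start_s: bytes_per_second}}.
    """
    # Deltas only make sense within a single device's counter, so group first.
    by_device = defaultdict(list)
    for serial, ts, locality, counter in samples:
        by_device[serial].append((ts, locality, counter))

    totals = defaultdict(lambda: defaultdict(float))
    for rows in by_device.values():
        rows.sort()  # order each device's readings by time
        device_rates = defaultdict(list)  # (locality, window) -> list of rates
        for (t0, _, c0), (t1, loc, c1) in zip(rows, rows[1:]):
            if t1 <= t0 or c1 < c0:
                continue  # skip duplicate timestamps and counter resets
            # Align: file the rate under the window holding the later reading.
            window = t1 - t1 % window_s
            device_rates[(loc, window)].append((c1 - c0) / (t1 - t0))
        # Aggregate: average within a device, then sum across devices.
        for (loc, window), rates in device_rates.items():
            totals[loc][window] += mean(rates)
    return {loc: dict(windows) for loc, windows in totals.items()}
```

Averaging per device before summing is a design choice in this sketch: it keeps a device that happened to be sampled several times in one window from being counted more than once.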

I set up 2.x for myself recently, and they have really done a lot of work. The OSS offering has most of the features that cloud/enterprise has. It was easy to set up -- they don't have any instructions for installing it in Kubernetes, and haven't updated their Helm charts for 2.x, but it was like 3 minutes to write a manifest (https://github.com/jrockway/jrock.us/tree/master/production/...) myself, which I prefer 99.9% of the time anyway. The new query language is incredibly verbose, but I see the steps that I remember having with Google's internal system, align, delta, aggregate... all possible. (I had to scratch my head a lot, though, to make it work. And I really am not able to reason about what operations it's doing, what's indexed or not indexed, why I ingest my data as rows but process it as columns, etc.) The performance is good, and it worked well for my use case of pushing data from my Intranet of Stuff. Generally I like it and I don't think they are being shady in any way. It's on my list of things to set up at work to collect various pieces of time series data outside of the Prometheus ecosystem (CI runtimes, etc.).

The reason I picked InfluxDB over TimescaleDB for my personal stuff is because InfluxDB has an HTTP API with built-in authentication. I already have a ton of HTTP services exposed to the Internet, and I understand them well. (Yup, I have SSO and rate limiting and all that stuff for my personal projects ;) I can give each of my devices an API key from their web interface, and I make an HTTP request to write data. Very simple. (They have a client library, but honestly my main target is a Beaglebone, and it doesn't have enough memory to compile their client library. I've never seen "go build" run out of memory, but their client makes that happen. I shouldn't develop on my IoT device, of course, but it's just easier because it has Emacs and gopls, and all the sensors connected to the right bus. Was easier to just manually make the API calls than to cross-compile on my workstation and push the release build to the actual device.) TimescaleDB doesn't have that, because it's just Postgres. So I'd basically have to expose port 5432 to the world, create Postgres users for every device, generate a password, store that somewhere, etc. Then to ingest data, I'd connect to the database, tune my connection pool, retry failed requests manually, etc. Using HTTP gets me all that for free; I can just configure retries in Envoy.
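
For a sense of how little is involved in "manually make the API calls", here is a rough Python sketch of building a write request against the InfluxDB 2.x `/api/v2/write` endpoint with a per-device token. The helper name `build_write_request`, the host, org, bucket, and measurement names are all made up for illustration, and it assumes integer fields and tag values with no spaces or commas (line protocol would need those escaped).

```python
import urllib.parse
import urllib.request


def build_write_request(base_url, org, bucket, token,
                        measurement, tags, fields, ts_s):
    """Build (but do not send) a line-protocol write for InfluxDB 2.x."""
    # Line protocol: measurement,tag=value field=value timestamp
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    # The trailing "i" marks an integer field; assumes all fields are ints.
    field_part = ",".join(f"{k}={v}i" for k, v in sorted(fields.items()))
    line = f"{measurement}{tag_part} {field_part} {ts_s}"

    # precision=s tells the server the timestamp is in seconds.
    query = urllib.parse.urlencode(
        {"org": org, "bucket": bucket, "precision": "s"}
    )
    return urllib.request.Request(
        f"{base_url}/api/v2/write?{query}",
        data=line.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token {token}",  # API token from the web UI
            "Content-Type": "text/plain; charset=utf-8",
        },
    )
```

Sending it is then `urllib.request.urlopen(req)`; as far as I know a successful write comes back as 204 with no body. Retries can stay out of the device code entirely if a proxy in front handles them.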

But... SQL queries are a lot easier to figure out than Flux queries, and I already have good tools for manipulating raw data in Postgres (DataGrip is my preferred method), so I think I will likely be revisiting TimescaleDB. Honestly, I'd pay for a managed offering right now if they had a button in Google Cloud Console that was "Create Instance and by the way this just gets added to your GCP bill for 10% more than a normal Cloud SQL instance".