InfluxDB has taken its open-source business to Silicon Valley (technical.ly)
46 points by rhoml 4102 days ago
8 comments

I really want to love influxdb because I think the world needs a better answer to time series databases that doesn't include Java (OpenTSDB, Cassandra). The underlying storage engine (leveldb/rocksdb) is quite solid. I'm currently running 3 nodes in production (for collecting stats) and doing a few thousand writes/s. I'm not using any of the clustering features, and I probably won't even evaluate them until 0.9.

I'm currently running the latest 0.8.x release and there are a few issues:

1. My influxdb instances stop servicing reads once every 12 hours, so I have a cron job that force-restarts them. https://github.com/influxdb/influxdb/issues/1116

2. Enabling the graphite plugin on the first run can crash the process (the creation of the default cluster admin user seems to be racy). Not a big deal except in automated deployment scenarios.

3. I lost an entire database (luckily it was just used for storing grafana graph definitions and not actual data).

4. I'm not sure if anyone's currently working on their admin UI. I submitted a pull request to their admin UI to sort shards by ID because currently it randomizes the order on every load (I presume because of golang's randomized map iteration). It's sat there since January. The last PR they merged into that repo was in May of 2014.

I really want influxdb to be successful. Every organization I've worked for in the last few years has had serious graphite scaling issues, and influxdb is well positioned to fix those. I think even in its current state it's a better option than graphite (and the influxdb-graphite plugin gives you all the graphite features).

Hi Shane, thanks for the encouragement and sorry you're having a few problems with the current 0.8.8 release.

We're heads down working on 0.9.0 and won't be doing any more releases in the 0.8.x line (except to create a migration path to 0.9.0). So we are merging PRs, but only those that apply to 0.9.0 (which includes the admin UI).

Another InfluxDB user here. I'd done some evaluations with OpenTSDB and the Graphite suite, and while I had some concerns with stability and maturity the main things that sold me on it were:

* No dependencies. Compare this with setting up HDFS/HBase and Graphite which is a real pain in the neck to manage, especially since my tsdb has to run on an arbitrary machine pool in a sandbox.

* Active development. This is a big one. Releases have been coming steadily and Paul & co. do a good job of having a real roadmap and chipping away at it; this is probably my tipping point over Graphite.

* Clustering. Maybe it's not there yet, but see above. Most tools in this space are not elastic at all.

* Grafana integration - seems like there is a good bit of momentum in that project in general which is promising.

PS Reading this, it almost sounds like an ad. No, I'm not affiliated with Influx.

PPS logfile configuration for rotation/cleanup would be a nice-to-have enhancement ;)

Thanks, we're working hard on getting the clustering features complete so we have a real answer for HA, failover, and scalability (up to a point based on current design).

For logfile rotation our recommended solution is to use logrotate. We'll be updating the install to include a config. See https://github.com/influxdb/influxdb/issues/1943

Incidentally, I wrote a blog post on it last week: http://grisha.org/blog/2015/03/20/influxdb-data/

The site says "production ready in March" - it seemed to me like there's at least 3 months of work there, given that most of the clustering features (e.g. how to rebuild a failed node, how to expand the cluster, distributed queries) are not there.

My other concern is that InfluxDB might follow the fate of FoundationDB - get acquired by a giant corporation and disappear.

Hi Grisha, I saw that post, thanks for writing it! The coming features you're talking about are the work we're focused on for finishing this release. The three you mention should drop in an RC within two weeks.

The distributed queries part isn't a large amount of work because of how we've designed things. Under the covers the query engine already represents each query as a MapReduce job to be run.

For cluster expansion, work is starting on that today. Again it's just a matter of wiring some things up. Node replacement is also starting today.

We may miss the March goal but it won't be by anything close to 3 months. Glad you're paying attention to the project though :)

For the Foundation problem, I thought they were never open source. Just free for 5 nodes or less, no?

I think the key to avoiding this fate is to build an active community of contributors outside the company. Luckily we have people submitting PRs every week. We'll be trying to document more of the code and make it easier for outsiders to get involved as we go along.

That way if the worst happens, at least the community can fork and keep the project going forward. I'd love nothing more than for Influx to become bigger than this company.

Thanks Paul! So you're saying it's all a SMOP :)

Another thing that I think might be a critical (or at least interesting) characteristic is back-filling optimization, i.e. when you need to load a trillion data points of historical data - this YouTube talk explains it pretty well and talks about how OpenTSDB addresses it: https://www.youtube.com/watch?v=SgD3RD2Shg4

Anyhow - keep up the good work, I very much believe that in the next couple of years "Time Series" is going to become a resume-must-include buzzword :)

Cool, I'll have to take a look at that talk. We've had people ask about backfilling large amounts of data so it's something we'll have to figure out.

Another thing I was curious about is why not do all the clustering/distributed stuff at the db level, i.e. have some sort of a distributed BoltDB-like/Raft as a separate layer or even an entirely separate project, and then InfluxDB would be a much thinner/simpler thing. I think that in general the approach of OpenTSDB and similar things is right, it's just that HBase/Hadoop is such a pain to set up and maintain (and so is Cassandra, if perhaps a little less).

One of the key goals of the project is to be able to aggregate and downsample from raw high precision data. That means we want a framework in which we ship the code to where the data lives, not the other way around.

The abstractions I've seen that have the database layer and then some services on top all miss this. They transport all of the raw data over the network and then run the computations and return the summary ticks back to the user.

Our framework lets us compute the summary ticks locally and send only those back (in many cases, but not all).
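
A minimal sketch of that idea, in Python rather than InfluxDB's actual Go code: each node reduces its own raw points to per-bucket partial aggregates (a sum and a count), and only those partials cross the network to be merged. The function names, the 10-second bucket, and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict

def local_partials(points, bucket_seconds):
    """Run on the node that holds the data: reduce raw (timestamp, value)
    points to one (sum, count) pair per time bucket."""
    partials = defaultdict(lambda: [0.0, 0])
    for ts, value in points:
        bucket = ts - ts % bucket_seconds  # start of the bucket containing ts
        partials[bucket][0] += value
        partials[bucket][1] += 1
    return {b: (s, c) for b, (s, c) in partials.items()}

def merge_means(all_partials):
    """Run on the coordinator: combine per-node partials into per-bucket means."""
    merged = defaultdict(lambda: [0.0, 0])
    for partials in all_partials:
        for bucket, (s, c) in partials.items():
            merged[bucket][0] += s
            merged[bucket][1] += c
    return {b: s / c for b, (s, c) in sorted(merged.items())}

# Hypothetical raw data living on two different nodes.
node_a = [(0, 1.0), (3, 3.0), (12, 10.0)]
node_b = [(5, 5.0), (14, 20.0), (17, 30.0)]

summary = merge_means([local_partials(node_a, 10), local_partials(node_b, 10)])
print(summary)  # {0: 3.0, 10: 20.0}
```

Shipping (sum, count) pairs instead of per-node means matters, because a mean of means is wrong whenever nodes hold different numbers of points. Aggregates such as an exact median cannot be merged from small partials like this, which is one plausible reading of "in many cases, but not all".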

I am using InfluxDB in my research to analyze the resource utilization of running applications, and it has been very useful to me, but I think it was supposed to be production ready this March. There are some bugs that occur from time to time.

We're busy at work on the production ready version. We're targeting March, but we won't release until it's ready (even if that means slipping our target).

Remember, in software development there are lies, damn lies, and delivery estimates.

We'll get it out as quickly as possible, sorry for any delays.

Can you say something about the upgrade path? Will it be possible to do live?

We haven't built the migration tool yet, but most likely it will involve running a new version in parallel with the old version while the upgrade runs.

For a guide on how to design your schema for a clean migration see here: http://influxdb.com/docs/v0.8/advanced_topics/schema_design....

Great product and smart pivot, I hope they do well. Them blowing up would be another win for the NYC tech scene (indirectly).

Thanks Eric!

Denver office? Sounds like someone recruited a few Pivotal Labs alumni :)

We have, but sadly not in Denver... yet ;)

Any comparisons with KairosDB or OpenTSDB?

KairosDB is less popular and smaller, but it has different limitations and is more flexible. It was inspired by the OpenTSDB design but then took a different path.

OpenTSDB relies on HBase. KairosDB has a configurable, pluggable datastore, but the only production-ready one so far uses Cassandra.

OpenTSDB always does interpolation of values for aggregation (which I found to be a hazardous decision). KairosDB does not really do proper "vertical" aggregation across series (by vertical I mean not downsampling).
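
To make the interpolation concern concrete, here is a small Python sketch (not OpenTSDB's code; the series, timestamps, and function names are made up for illustration) that sums two series whose samples fall on different timestamps, filling in the missing side by linear interpolation.

```python
def value_at(series, ts):
    """Return the value of a sorted (timestamp, value) series at ts:
    the stored value if one exists, a linear interpolation between the
    two neighbouring points otherwise, or None outside the series' range."""
    exact = dict(series)
    if ts in exact:
        return exact[ts]
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        if t0 < ts < t1:
            return v0 + (v1 - v0) * (ts - t0) / (t1 - t0)
    return None

def sum_with_interpolation(a, b):
    """Sum two series at the union of their timestamps, interpolating gaps."""
    timestamps = sorted({t for t, _ in a} | {t for t, _ in b})
    result = []
    for ts in timestamps:
        values = [v for v in (value_at(a, ts), value_at(b, ts)) if v is not None]
        result.append((ts, sum(values)))
    return result

# Two hypothetical series sampled at different times.
a = [(0, 10.0), (20, 30.0)]
b = [(10, 1.0), (30, 3.0)]

total = sum_with_interpolation(a, b)
print(total)  # [(0, 10.0), (10, 21.0), (20, 32.0), (30, 3.0)]
```

At ts=10 and ts=20, part of each total is a value nobody ever wrote. That is probably harmless for a smoothly varying gauge, but it can mislead for sparse or event-like data, which is presumably the hazard meant here.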

OpenTSDB is GPL, KairosDB is Apache 2.0 (that counts for closed-source integrations).

OpenTSDB supports only numerical data but supports annotations. KairosDB supports strings and numerical data in its baseline and is compatible with any data type; it does not have annotations (but you may use strings for that).

On their baseline OpenTSDB produces graphs on the server, KairosDB produces graphs on the client.

Both are integrated with the Grafana time series dashboard. OpenTSDB has more side projects; KairosDB is the only time series database I know of that is integrated with a reporting tool (BIRT).

OpenTSDB requires creating metrics in advance using a special tool (it needs to lock the cluster to allocate a new ID); KairosDB can accept any kind of new metric on the fly.

If you need something modular for building custom features I strongly recommend KairosDB. Look at the code, it's really nicely crafted.

Otherwise, they both have their merits. I found that KairosDB is also much less limited in corner cases (while having fewer side projects), and we now use KairosDB intensively.

Congrats Paul!

Mike (BMark Admin)

thanks Mike!