by pauldix 2053 days ago
Timescale is built on top of Postgres, which is a row-oriented database. They've built a kind of columnar layer on top of it, which is quite interesting. Because it's Postgres, you get its full SQL support.

Meanwhile, InfluxDB IOx has a very different set of goals than Postgres. It's not an OLTP (transactional) DB and never will be. It's firmly targeted at OLAP and real-time OLAP workloads.

That means we can do things like optimize for running on ephemeral storage with object storage as the persistence layer. It'll have fine-grained control over replication, how data is partitioned in a cluster, and where data is indexed, queried, queued for writes, and more. It'll have push and pull replication, bulk transfer, and persistence with Parquet. This last bit means you get integration with other data processing and data warehousing tools with minimal effort.

It'll also support Arrow Flight, which will give it great integration into the data science ecosystems in Python and R.

Right now, InfluxDB IOx is too early in development for any real comparison of actual operation. We're putting this out now so that people can see what we're doing, comment on it, and maybe even contribute. We think it's an interesting approach where no single item is completely novel, but the composition of everything together makes it a unique offering in open source.

Edit: one other thing I forgot to mention. InfluxDB IOx is open source, Timescale isn't. For some that matters, for many it doesn't. Depends on your use case.


Can you elaborate on what you mean by TimescaleDB not being open source?

https://github.com/timescale/timescaledb

It's under a community license, which has restrictions. The limitations on derivative works and value-added products or services are the ones that will create the most problems for people trying to build a business on it: https://www.timescale.com/legal/licenses

For users within large organizations, they're likely not able to use the software without approval from their legal department because it doesn't fall under any open source license.

Like I said, whether you care is really case dependent.

This is misinformation. Most of TimescaleDB is open source under Apache 2. The difference is that the advanced features of TimescaleDB (e.g., clustering) are under a source-available license and are free, while advanced InfluxDB features like clustering are under a paid enterprise license. In fact, TimescaleDB recently made all of our enterprise features available for free. So one could argue that TimescaleDB is more open than Influx.

For more information on TimescaleDB licensing, check out this blog post: https://blog.timescale.com/blog/building-open-source-busines...

(Disclaimer/FYI: I work at Timescale.)

My post is about InfluxDB IOx, which is the project this thread is about. You're correct about InfluxDB having HA and clustering under a closed source enterprise license. If you read the post, I even mention this as a shortcoming of the project. One which we're hoping to rectify with InfluxDB IOx.

So some parts of Timescale are under actual Apache 2 and some parts are under a proprietary source-available license. I'm not sure how many lines of code fall under each license, or how it's actually organized in your repo. I'll leave it up to your potential users to figure out which is which and disentangle what parts are actually open.

As I recall, AWS very publicly forked Elastic because of this very same type of confusion. The difference is that if AWS were going to fork your project, they'd just fork Postgres, which is the real open source software that you're benefitting from.

If I were building a developer-focused analytics, monitoring, or data analysis product, I wouldn't do it on top of Timescale because some parts of your codebase most definitely prevent that through your license. But that's me.

> If I were building a developer-focused analytics, monitoring, or data analysis product, I wouldn't do it on top of Timescale because some parts of your codebase most definitely prevent that through your license. But that's me.

That's also FUD, two ways.

First, what the Timescale License prevents is somebody offering our Community Edition as a standalone "TimescaleDB-as-a-Service", a la AWS bundling it as part of RDS, or Microsoft as part of Azure Postgres. There is a clean technical test for "DDL access to the database" by users in the license. It's not tricky. You can absolutely develop/sell/distribute/provide analytics, monitoring, or data analysis products on top of TimescaleDB Community Edition. Many companies do.
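To make the DDL distinction above concrete, here is a minimal Python sketch that separates schema-defining statements from everything else by their leading keyword. It is only an illustration of what "DDL access" refers to in general SQL terms. The keyword list and the `is_ddl` helper are assumptions made for this example; they are not the license's wording and not a legal test.

```python
# Assumption: this keyword list is illustrative only. It is not the
# Timescale License's definition of DDL and not a legal test.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP"}


def is_ddl(statement):
    """Return True if the statement's first keyword is a schema-defining one."""
    words = statement.strip().split(None, 1)
    return bool(words) and words[0].upper() in DDL_KEYWORDS


if __name__ == "__main__":
    # Hypothetical statements, used only to show the split.
    samples = [
        "CREATE TABLE metrics (ts timestamptz, value double precision)",
        "INSERT INTO metrics VALUES (now(), 1.0)",
        "SELECT avg(value) FROM metrics",
    ]
    for sql in samples:
        kind = "DDL" if is_ddl(sql) else "not DDL"
        print(f"{kind}: {sql}")
```

Under this reading, a product whose end users only run the second and third kind of statement never hands them DDL access, while one that lets them issue the first kind does.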

As to "hopelessly-entangled source", if you know what a directory is, you can tell the difference. There's a "/tsl" subdirectory with Timescale Licensed code. Everything else is Apache 2. You can compile pure Apache 2 versions with a single compile flag, and we distribute Apache 2 binaries. In fact, the Postgres community itself distributes Apache 2 binaries, and Microsoft, Digital Ocean, Rackspace, and other clouds make the Apache 2 version available as part of their managed database offerings.
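For anyone who wants to check the directory split described above against a local checkout, a rough Python sketch follows. It tallies non-blank source lines under the top-level "tsl" directory versus everything else. The `count_loc_by_license` name, the choice of file extensions, and the treatment of every non-"tsl" file as Apache 2 are assumptions for illustration, not something taken from the repository's own tooling.

```python
import os
from collections import Counter

# Assumption: "tsl" is the top-level directory holding Timescale Licensed code,
# as described in the comment above; everything else is counted as Apache 2.
TSL_DIR = "tsl"
# Assumption: which extensions count as "source" is an illustrative choice.
SOURCE_EXTENSIONS = (".c", ".h", ".sql")


def count_loc_by_license(repo_root):
    """Return a Counter of non-blank source lines keyed by 'tsl' or 'apache2'."""
    totals = Counter()
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Skip version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        # The first path component relative to the root decides the bucket.
        top = os.path.relpath(dirpath, repo_root).split(os.sep)[0]
        bucket = "tsl" if top == TSL_DIR else "apache2"
        for name in filenames:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                totals[bucket] += sum(1 for line in fh if line.strip())
    return totals
```

Calling `count_loc_by_license("path/to/timescaledb")` on a checkout (the path is a placeholder) gives a per-license line count, which is a crude measure but enough to see how the code is divided.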

So if my users can't have DDL access, that means they can't define the schemas for the analytics that they want to do? It only works if, as the developer of an application, I have a fixed schema that my users interact with?