by 1996 2303 days ago
I cringe a bit inside at people using, say, NoSQL approaches when it makes literally no sense to do so.

Therefore I think the lack of OLTP support will not matter much and that ClickHouse will be widely used, but also misused when it becomes too fashionable.


This makes no sense.

For example, aside from the lack of transactions, ClickHouse is designed for insertion. There's an INSERT statement, but no standalone UPDATE or DELETE statements. You can rewrite tables (there's ALTER TABLE ... UPDATE and ALTER TABLE ... DELETE), but those are intended for large batch operations, and they are potentially asynchronous: the statement returns right away, but you only see the results later.
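To make the "returns right away, results show up later" behavior concrete, here is a toy Python model. It is not ClickHouse code and does not use any ClickHouse client API; `ToyTable`, `alter_delete`, and `run_background_work` are hypothetical names invented purely for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class ToyTable:
    """Toy model (NOT ClickHouse) of a table whose mutations are applied lazily."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows: List[Row] = list(rows)
        # Mutations that have been accepted but not yet applied.
        self.pending: List[Callable[[Row], bool]] = []

    def alter_delete(self, predicate: Callable[[Row], bool]) -> None:
        # Hypothetical stand-in for an asynchronous batch delete:
        # the request is only queued, so this call returns immediately.
        self.pending.append(predicate)

    def select(self) -> List[Row]:
        # Reads see whatever has been applied so far.
        return list(self.rows)

    def run_background_work(self) -> None:
        # Apply every queued mutation by rewriting the row list.
        for predicate in self.pending:
            self.rows = [r for r in self.rows if not predicate(r)]
        self.pending.clear()
```

In this sketch, a `select()` issued between `alter_delete(...)` and `run_background_work()` still returns the rows that were supposed to be deleted, which is the kind of surprise the comment is warning about.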

ClickHouse has many other limitations. For example, there's no enforcement of uniqueness: You can insert the same primary key multiple times. You can dedupe the data, but only specific table engines support this.
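A rough way to picture this, as a toy Python model rather than anything ClickHouse actually runs: inserts never check for an existing key, and duplicates only collapse when a separate merge step keeps the newest version per key. This is loosely modeled on what an engine such as ReplacingMergeTree does at merge time; the `Row` type and the `insert` and `merge_keep_latest` helpers are hypothetical names for illustration.

```python
from typing import Dict, List, NamedTuple


class Row(NamedTuple):
    key: int      # plays the role of the "primary key"
    version: int  # higher version wins during the merge
    value: str


def insert(table: List[Row], row: Row) -> None:
    # No uniqueness check: the same key can be appended any number of times.
    table.append(row)


def merge_keep_latest(table: List[Row]) -> List[Row]:
    # Toy dedupe step: keep one row per key, preferring the highest version
    # (on a tie, the row inserted later wins).
    latest: Dict[int, Row] = {}
    for row in table:
        current = latest.get(row.key)
        if current is None or row.version >= current.version:
            latest[row.key] = row
    return sorted(latest.values(), key=lambda r: r.key)
```

Until the merge step runs, a reader of `table` sees every duplicate, so any query that assumes one row per key has to handle that itself.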

There's absolutely no way anyone will want to use ClickHouse as a general-purpose database.

I should have phrased that differently: if something is good enough in some key metric, it extends to other uses - even if it makes a poor fit.

So I insist: everyone will WANT to use ClickHouse as a general-purpose database, and will create ways to make it so (e.g. copy the table with only the columns you want to keep, drop the original, rename).
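The copy / drop / rename trick can be sketched as follows. This uses Python's built-in `sqlite3` module as a stand-in so the example runs anywhere; it is not ClickHouse syntax or a ClickHouse client, and the helper name `rewrite_keeping_columns` and the `_rewrite` suffix are assumptions made for illustration.

```python
import sqlite3
from typing import List


def rewrite_keeping_columns(
    conn: sqlite3.Connection, table: str, keep_columns: List[str]
) -> None:
    """Emulate dropping columns by copying the table, dropping it, and renaming.

    Identifiers are interpolated directly into the SQL, so this sketch assumes
    `table` and `keep_columns` come from trusted code, not from user input.
    """
    tmp = f"{table}_rewrite"  # hypothetical temporary table name
    column_list = ", ".join(keep_columns)
    # 1. Copy only the columns we want to keep into a new table.
    conn.execute(f"CREATE TABLE {tmp} AS SELECT {column_list} FROM {table}")
    # 2. Drop the original table.
    conn.execute(f"DROP TABLE {table}")
    # 3. Rename the copy so it takes the original's place.
    conn.execute(f"ALTER TABLE {tmp} RENAME TO {table}")
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, payload TEXT, debug TEXT)")
    conn.execute("INSERT INTO events VALUES (1, 'a', 'x')")
    rewrite_keeping_columns(conn, "events", ["id", "payload"])
    print(conn.execute("SELECT * FROM events").fetchall())
```

The cost is obvious: the whole table is rewritten for what would be a metadata change elsewhere, and readers of the table see it disappear briefly between the drop and the rename.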

It is just too fast and too good for many other things, so it will expand from these strongholds to the rest.

A personal example: I am migrating my cold storage to ClickHouse, because I can just copy the files in place and be up and running.

I know about INSERT and the like, and I have a great existing system - but this lets me simplify the design and deprecate many things. Fewer moving parts is in general better.

After that is done, there is a database where I would benefit from things like ALTER TABLE or advanced joins, but keeping PostgreSQL and ClickHouse side by side, just for this? No. PostgreSQL will go. Dirty tricks will be deployed. Data will be duplicated if necessary.

Advanced joins (specifically merge joins) and object storage are on the way. See the following PRs:

* https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Apr+mer... -- Recent work to enable merge joins

* https://github.com/ClickHouse/ClickHouse/pulls?q=is%3Apr+s3 -- Same thing for managing data on S3 compatible object storage

There's been a lot of community interest in both topics. Merge join work is largely driven by the ClickHouse team at Yandex. Object storage contributions are from a wider range of teams.

That said, I don't see ClickHouse replacing OLTP databases any time soon. It's an analytic store, and many of the design choices favor fast, resource-efficient scanning and aggregation over large datasets. ClickHouse is not the right choice for high levels of concurrent users working on mutable point data. For this, Redis, PostgreSQL, or MySQL are your friends.

Sure - but the comment you're replying to made no mention of NoSQL. It just said ClickHouse lacks OLTP by design. That doesn't mean it won't be widely used, just that it will perhaps be limited to analytics workloads.

If you need deletes and transactions, look elsewhere, but ClickHouse seems to be great for what it's been designed for.