by zX41ZdbW 1219 days ago
I work at ClickHouse, and I do competitive research. Here are my observations:

BigQuery bills for storage by uncompressed data size ($20/TB/month), which, given typical compression ratios, makes it very expensive. Processing is also very expensive ($5/TB of uncompressed data processed).
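To make the storage math concrete, here is a rough back-of-the-envelope sketch in Python. The two prices are the ones quoted above; the 10x compression ratio and the 100 TB dataset are assumptions picked purely for illustration, not measured figures, and the function names are hypothetical.

```python
# Prices as quoted in the comment above.
STORAGE_USD_PER_TB_MONTH = 20.0  # billed on uncompressed size
SCAN_USD_PER_TB = 5.0            # billed on uncompressed data processed


def monthly_storage_cost(uncompressed_tb: float) -> float:
    """Monthly storage bill when billing is based on uncompressed size."""
    return uncompressed_tb * STORAGE_USD_PER_TB_MONTH


def effective_cost_per_compressed_tb(compression_ratio: float) -> float:
    """What one TB of data on disk effectively costs per month if each
    compressed TB expands to `compression_ratio` TB when uncompressed."""
    return STORAGE_USD_PER_TB_MONTH * compression_ratio


def scan_cost(uncompressed_tb_scanned: float) -> float:
    """Cost of a query that processes the given uncompressed volume."""
    return uncompressed_tb_scanned * SCAN_USD_PER_TB


if __name__ == "__main__":
    # Assumed values for illustration only.
    dataset_tb = 100.0
    ratio = 10.0
    print(monthly_storage_cost(dataset_tb))
    print(effective_cost_per_compressed_tb(ratio))
    print(scan_cost(3.0))
```

Under those assumed numbers, the 100 TB of uncompressed data would occupy about 10 TB on disk, yet the bill is still computed on the full 100 TB.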

Every query in BigQuery takes at least around a second. Sub-second latency cannot be achieved in most cases, although there is a feature that enables it in limited scenarios.

It is impossible to insert new data in real time and also process range queries quickly. Range queries require pre-sorted data to run quickly, and there is no "merge tree" magic like in ClickHouse. This makes real-time analytics and user-facing queries impossible with BigQuery, unlike ClickHouse. You will end up creating daily, hourly, and minute tables in BigQuery, while in ClickHouse all the data will reside in one big table with no hassle.
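A toy illustration of why pre-sorted data matters for range queries. This is not how BigQuery or ClickHouse is implemented, just the general idea: with sorted keys, the bounds of a range can be found by binary search instead of scanning every row. All names and data here are hypothetical.

```python
from bisect import bisect_left, bisect_right


def range_scan_unsorted(timestamps, lo, hi):
    """Without ordering, every row has to be checked: O(n)."""
    return [t for t in timestamps if lo <= t <= hi]


def range_scan_sorted(sorted_timestamps, lo, hi):
    """With sorted data, binary search finds both ends of the range
    in O(log n), and only the matching slice is read."""
    start = bisect_left(sorted_timestamps, lo)
    end = bisect_right(sorted_timestamps, hi)
    return sorted_timestamps[start:end]


# Made-up event timestamps arriving out of order.
events = [17, 3, 42, 8, 25, 11]
sorted_events = sorted(events)

if __name__ == "__main__":
    print(range_scan_unsorted(events, 8, 25))
    print(range_scan_sorted(sorted_events, 8, 25))
```

The catch in a real-time setting is keeping the data sorted while new rows keep arriving, which is the part the "merge tree" remark refers to.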

BigQuery is not as versatile as ClickHouse. It is just an SQL engine, and you won't find as many practical features for web analytics, financial data, sensor data, or APM... as in ClickHouse.

You cannot run it on your own infrastructure if you need to.

Advantages of BigQuery:

Almost no settings.

Queries scale automatically, and the number of workers is selected for every query as needed.

Fairly good for long-running queries.


For a user's perspective, you can also check out this recent blog post: https://clickhouse.com/blog/hifis-migration-from-bigquery-to...

TL;DR:
* HiFi moved because BQ pricing and query latency did not fit the needs of their customer-facing app (music royalty data analytics)
* Caveat when moving to ClickHouse: they had to adjust how they handle JOINs

Disclaimer: I also work at ClickHouse.

> a single HIFI Enterprise account can easily have half a gigabyte of associated royalty data representing over 25 million rows of streaming and other transaction data

Curious about why they were struggling with this workload on BigQuery. It just doesn’t seem like very much data. Maybe they were cost-constrained and using a tiny slot reservation?