|
You can mount read replicas on object storage in ClickHouse. Example: CREATE DATABASE test;
USE test;
CREATE TABLE hackernews_history UUID '66491946-56e3-4790-a112-d2dc3963e68a'
(
`update_time` DateTime DEFAULT now(),
`id` UInt32,
`deleted` UInt8,
`type` Enum8('story' = 1, 'comment' = 2, 'poll' = 3, 'pollopt' = 4, 'job' = 5),
`by` LowCardinality(String),
`time` DateTime,
`text` String,
`dead` UInt8,
`parent` UInt32,
`poll` UInt32,
`kids` Array(UInt32),
`url` String,
`score` Int32,
`title` String,
`parts` Array(UInt32),
`descendants` Int32
)
ENGINE = ReplacingMergeTree(update_time)
ORDER BY id
SETTINGS disk = disk(readonly = true, type = 's3_plain_rewritable', endpoint = 'https://clicklake-test-2.s3.eu-central-1.amazonaws.com/', use_environment_credentials = false);
And you can try it right now.Install ClickHouse: curl https://clickhouse.com/ | sh
./clickhouse local
Run the query above to attach the table.The table is updated in real time. For example, here is your comment: :) SELECT * FROM hackernews_history WHERE text LIKE '%Clickhouse is amazing%' ORDER BY update_time \G
Row 1:
──────
update_time: 2024-04-06 16:35:28
id: 39785472
deleted: 0
type: comment
by: mightybyte
time: 2024-03-21 22:59:20
text: I'll second this. Clickhouse is amazing. I was actually using it today to query some CSV files. I had to refresh my memory on the syntax so if anyone is interested:<p><pre><code> clickhouse local -q "SELECT foo, sum(bar) FROM file('foobar.csv', CSV) GROUP BY foo FORMAT Pretty"
</code></pre>
Way easier than opening in Excel and creating a pivot table which was my previous workflow.<p>Here's a list of the different input and output formats that it supports.<p><a href="https://clickhouse.com/docs/en/interfaces/formats" rel="nofollow">https://clickhouse.com/docs/en/interfaces/formats</a>
dead: 0
parent: 39784942
poll: 0
kids: [39788575]
url:
score: 0
title:
parts: []
descendants: 0
Row 2:
──────
update_time: 2024-04-06 18:07:34
id: 31334599
deleted: 0
type: comment
by: richieartoul
time: 2022-05-11 00:54:31
text: Not really. Clickhouse is amazing, but if you want to run it at massive scale you’ll have to invest a lot into sharding and clustering and all that. Druid is more distributed by default, but doesn’t support as sophisticated of queries as Clickhouse does.<p>Neither Clickhouse nor Druid can hold a candle to what Snowflake can do in terms of query capabilities, as well as the flexibility and richness of their product.<p>That’s just scratching the surface. They’re completely different product categories IMO, although they have a lot of technical / architectural overlap depending on how much you squint.<p>Devil is in the details basically.
dead: 0
parent: 31334527
poll: 0
kids: [31334736]
url:
score: 0
title:
parts: []
descendants: 0
Row 3:
──────
update_time: 2024-11-07 22:29:09
id: 42081672
deleted: 0
type: comment
by: maxmcd
time: 2024-11-07 22:13:12
text: Using duckdb and apache iceberg means that you can run read replicas without any operational burden. Clickhouse is amazing, but they do not allow you to mount dumb read replicas to object storage (yet).<p>I can imagine this product is a very elegant solution for many types of companies/teams/workloads.
dead: 0
parent: 42080385
poll: 0
kids: []
url:
score: 0
title:
parts: []
descendants: 0
3 rows in set. Elapsed: 3.981 sec. Processed 42.27 million rows, 14.45 GB (10.62 million rows/s., 3.63 GB/s.)
Peak memory usage: 579.26 MiB.
|
Query id: daa202a3-874c-4a68-9e3c-974560ba4624
Elapsed: 0.092 sec.
Received exception: Code: 499. DB::Exception: The AWS Access Key Id you provided does not exist in our records. (Code: 23, S3 exception: 'InvalidAccessKeyId'): While processing disk(readonly = true, type = 's3_plain_rewritable', endpoint = 'https://clicklake-test-2.s3.eu-central-1.amazonaws.com/', use_environment_credentials = false). (S3_ERROR)