Hacker News
by qoega 1468 days ago
If you use S3 as storage for cold data, the storage cost becomes just the S3 (or other object storage) cost. And ClickHouse compression ratios for Twitter-like data can be 10x or better, since you do not store the raw data and the data is stored in columns.
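To make that concrete, here is a rough back-of-the-envelope sketch. The function name and the price per GB-month are hypothetical placeholders, not a quote from any provider, and the 10x ratio is just the figure mentioned above; plug in your own numbers.

```python
def monthly_storage_cost(raw_tb, compression_ratio, price_per_gb_month):
    """Estimate monthly object-storage cost for compressed cold data.

    All inputs are assumptions supplied by the caller:
    raw_tb             -- uncompressed data size in decimal terabytes
    compression_ratio  -- e.g. 10 for "10x" columnar compression
    price_per_gb_month -- placeholder object-storage price, USD per GB-month
    """
    stored_gb = raw_tb * 1000 / compression_ratio
    return stored_gb * price_per_gb_month


# Illustrative inputs only: 1.5 TB raw, 10x compression, placeholder price.
cost = monthly_storage_cost(raw_tb=1.5, compression_ratio=10, price_per_gb_month=0.023)
print(f"~${cost:.2f} per month")
```

This ignores request and egress charges, which can matter for query-heavy workloads.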

Concerning ingestion costs, I think it took me ~10 hours on a 4 vCPU VM last time, when I loaded 1.5 TB of data into ClickHouse from this dataset: http://toddwschneider.com/posts/analyzing-1-1-billion-nyc-ta... The 1.1 billion rides have already grown to 3.5 billion.
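For a sense of scale, those figures work out roughly as follows. This is only a sketch: the helper name is made up, it uses decimal units (1 TB = 1,000,000 MB), and it assumes the load covered all 3.5 billion rows, which is my reading rather than a measured number.

```python
def ingestion_stats(data_tb, hours, vcpus, rows):
    """Derive rough ingestion throughput from total size, duration and row count."""
    seconds = hours * 3600
    data_mb = data_tb * 1_000_000  # decimal megabytes
    return {
        "vcpu_hours": hours * vcpus,      # compute consumed by the load
        "mb_per_s": data_mb / seconds,    # average throughput across the VM
        "rows_per_s": rows / seconds,     # average insert rate
    }


# Figures from the comment above; the row count is an assumption.
stats = ingestion_stats(data_tb=1.5, hours=10, vcpus=4, rows=3.5e9)
print(stats)
```

Multiply `vcpu_hours` by whatever your provider charges per vCPU-hour to get a ballpark ingestion cost.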

1 comment

Interesting. Would you mind talking offline (email in my bio)? Would love to get some pointers.