Dale from ClickHouse wrote a pretty extensive blog series on the Hacker News dataset, covering the ingest approach and some queries of interest.
Could be a good bit of reading alongside the project?
https://clickhouse.com/blog/getting-data-into-clickhouse-par...