Hacker News
by pradeepchhetri 895 days ago
I enjoy using clickhouse-local for parsing CSV files. I often need a custom delimiter and custom parsing rules, and I find it handles these edge cases very well. Recently I found that if my CSV files are compressed, I don't even need to uncompress them: it automatically figures out the compression format and processes them for me.
2 comments

Which one is better for querying CSV files, both from the command line and programmatically?

1. SQLite 2. DuckDB 3. clickhouse-local

Definitely the ClickHouse ecosystem.

- For querying CSV data from the command line, I use clickhouse-local.

- For querying CSV data programmatically from a library, I use chdb (an embedded version of ClickHouse).

- For querying large amounts of CSV data programmatically, I offload the work to a ClickHouse cluster, which can process it in a distributed fashion.
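To make the first two options a bit more concrete, here is a rough sketch in Python. It assumes the `file()` table function with the `CSVWithNames` format and the `format_csv_delimiter` setting; the helper names `csv_query` and `run_local` and the file name `events.csv.gz` are made up for illustration, and compression detection is left to ClickHouse based on the file extension.

```python
import shutil
import subprocess
from typing import Optional


def csv_query(path: str, select: str = "count()", delimiter: str = ",") -> str:
    """Build a ClickHouse SQL query over a CSV file (hypothetical helper)."""
    # CSVWithNames treats the first row as column names.
    # Note: plain string interpolation, so do not pass untrusted input here.
    sql = f"SELECT {select} FROM file('{path}', 'CSVWithNames')"
    if delimiter != ",":
        # Only add the setting when a non-default delimiter is requested.
        sql += f" SETTINGS format_csv_delimiter = '{delimiter}'"
    return sql


def run_local(sql: str) -> Optional[str]:
    """Run the query with clickhouse-local if it is on PATH, else return None."""
    binary = shutil.which("clickhouse-local")
    if binary is None:
        return None
    result = subprocess.run(
        [binary, "--query", sql], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Prints the generated SQL; call run_local(sql) to actually execute it.
    print(csv_query("events.csv.gz", delimiter=";"))
```

The same SQL string should also work with chdb, e.g. `chdb.query(sql, "CSV")`, if the package is installed; I have left that out of the sketch so it runs with the standard library alone.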

If you care about query performance, this blog post is useful: https://www.vantage.sh/blog/clickhouse-local-vs-duckdb

Didn’t know about them, seems interesting.