by theLiminator
155 days ago
Yeah, I'm similarly confused.
> "SQL should be the first option considered for new data engineering work. It’s robust, fast, future-proof and testable. With a bit of care, it’s clear and readable." (over polars/pandas etc.)
SQL itself has nothing to do with being fast. Not sure what makes it any more testable than polars? Future-proof in what way? I guess they mean your SQL dialect won't have breaking changes?
|
My current habit is to pull big datasets down to Parquet shards and then just query them with a wildcard in DuckDB. I move to BigQuery when doing true “big data”, but extracting a few GB from BQ to a notebook VM disk and querying it with DuckDB is super ergonomic and performant most of the time.
It’s the SQL that I like. Being a veteran of when the world went mad for NoSQL, it is just so nice to experience the revenge of SQL.