by mrtimo
280 days ago
I agree with this 100%. In the first 5 minutes of his talk [1], the creator of DuckDB argues that people using pandas are missing out on 50 years of progress in database research. I've been using Malloy [2], which compiles to SQL (like TypeScript compiles to JavaScript), so instead of editing a 1,000-line SQL script, I only edit 18 lines of Malloy. I'd love to see a blog post comparing a pandas approach to data cleaning with an SQL/Malloy approach.
[1] https://www.youtube.com/watch?v=PFUZlNQIndo
[2] https://www.malloydata.dev/
|
That's pandas. Polars builds on much of the same 50 years of progress in database research by offering a lazy DataFrame API that does query optimization, morsel-based columnar execution, predicate pushdown into file I/O, etc.
Disclaimer: I work for Polars on said query execution.
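To make "lazy API plus predicate pushdown" concrete, here is a toy sketch in plain Python. It is not how Polars is implemented; the `LazyFrame` class here, `CSV_TEXT`, and the `eager` helper are all hypothetical names made up for illustration. The idea it tries to show: the lazy version only records a plan, and when it finally runs, the filter and column selection are applied during the scan instead of after everything has been loaded.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = "city,temp\nOslo,3\nCairo,31\nLima,19\nDubai,38\n"


def eager(source):
    """Eager style: materialize every row and column, then filter, then project."""
    rows = list(csv.DictReader(io.StringIO(source)))
    hot = [r for r in rows if int(r["temp"]) > 25]
    return [{"city": r["city"]} for r in hot]


@dataclass
class LazyFrame:
    """Toy lazy frame: filter/select only record a plan; collect() runs it."""
    source: str
    predicates: tuple = ()   # (column, test) pairs recorded by filter()
    columns: tuple = ()      # columns recorded by select(); empty means all

    def filter(self, column, test):
        return LazyFrame(self.source, self.predicates + ((column, test),), self.columns)

    def select(self, *columns):
        return LazyFrame(self.source, self.predicates, columns)

    def collect(self):
        out = []
        for row in csv.DictReader(io.StringIO(self.source)):
            # Predicate pushdown: rows are tested while scanning,
            # so rejected rows are never kept in memory.
            if all(test(row[col]) for col, test in self.predicates):
                # Projection pushdown: keep only the requested columns.
                keep = self.columns or tuple(row)
                out.append({c: row[c] for c in keep})
        return out


# Nothing is read until collect() is called.
plan = LazyFrame(CSV_TEXT).filter("temp", lambda v: int(v) > 25).select("city")
result = plan.collect()
print(result)
```

Both versions give the same answer; the difference is how much data gets materialized along the way. In Polars itself the analogous chain would look roughly like `pl.scan_csv(...).filter(...).select(...).collect()`, with the optimizer deciding what to push into the file reader.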