by peterdstallion
268 days ago
This test was done on a small dev laptop with 16GB of RAM, scanning a 23GB Parquet file with 500M rows. DuckDB proved to be 5x faster. A bit of an obvious one: small-data tech is faster at small data. It serves more as a reminder of the lower bound of what counts as "small data" nowadays. The article rightly starts with:

> Processing power on laptops has increased dramatically over the last twenty years. This allows single laptops to accomplish what we needed multi-node Spark clusters to do ten years ago.