by glogla 1258 days ago
In a Databricks-published benchmark, of course Delta is the fastest. I have also seen companies that use Iceberg publish benchmarks showing that Iceberg is the fastest.

Vendor-published benchmarks are worthless.

2 comments

I think vendor-published benchmarks are fine if the dataset is open / accessible, the benchmark code is published, all software versions are disclosed, and the exact hardware is specified. In the data space, I definitely wouldn't consider an audited TPC benchmark based on industry-standard datasets / queries worthless. Disclosure: I work for Databricks.
fwiw - the lead authors on that linked paper are all grad students who are not employed at Databricks. That being said, they're advised by Databricks people.