Hacker News
by d_watt 1218 days ago
Ah, almost like a slightly lower-level Trino, where DuckDB is able to push queries out to different data storage endpoints?

Unfortunately not, at least not without a little intervention. See this blog post for more details about what I mean: they inspect the Iceberg table's catalogue to list the related Parquet files and then load them into DuckDB.

https://www.arecadata.com/getting-started-with-iceberg-using...
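To make the "list the files, then load them" idea concrete, here is a minimal sketch, not the blog post's actual code. It assumes you already have a listing of the table's current files from some catalogue client (the `listing` below is made up), and it only builds the DuckDB query string using `read_parquet` with an explicit file list. The helper names are hypothetical.

```python
from typing import Iterable


def sql_string(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def read_parquet_query(data_files: Iterable[str]) -> str:
    """Build a DuckDB query that scans an explicit list of Parquet files."""
    # Keep only Parquet data files; metadata/manifest files are skipped.
    files = [f for f in data_files if f.endswith(".parquet")]
    if not files:
        raise ValueError("the catalogue listing contained no Parquet files")
    file_list = ", ".join(sql_string(f) for f in files)
    return f"SELECT * FROM read_parquet([{file_list}])"


# Hypothetical listing, standing in for whatever the catalogue returns.
listing = [
    "s3://example-bucket/warehouse/my_table/data/part-0.parquet",
    "s3://example-bucket/warehouse/my_table/data/part-1.parquet",
    "s3://example-bucket/warehouse/my_table/metadata/v2.metadata.json",
]
query = read_parquet_query(listing)
print(query)
```

The resulting string could then be run through the DuckDB client of your choice. Note the sketch trusts the listing to contain only the files of the current snapshot; a plain directory listing of an Iceberg table can include stale or deleted data.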

You would still need to interact with some kind of catalogue to understand which .db files you need to fetch.

And honestly, I don't really know or understand the performance implications of the ATTACH command.

I'm excited to see if the DuckDB team will be able to integrate with external tables directly one day (not that the data files would be .db files).

Imagine this:

1) you have an externally managed table (Iceberg, Delta, etc., managed by Glue, Databricks, etc.)

2) register this table in DuckDB

    CREATE OR REPLACE EXTERNAL TABLE my_table ... TYPE = 'ICEBERG' CATALOG = 's3://...' CREDENTIALS = '...' etc

3) simply interact with the table in DuckDB as you would any other table
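Until something like that exists, a rough stand-in for step 2 is a view over the table's Parquet files, which you can then query by name as in step 3. This is only a sketch under assumptions: the file list is supplied by you (the paths below are made up), the helper names are hypothetical, and the view does not track new snapshots the way a real external table would.

```python
from typing import Sequence


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_view_sql(table_name: str, data_files: Sequence[str]) -> str:
    """Build a CREATE OR REPLACE VIEW statement over a fixed list of Parquet files."""
    if not data_files:
        raise ValueError("at least one data file is required")
    file_list = ", ".join(quote_literal(f) for f in data_files)
    return (
        f"CREATE OR REPLACE VIEW {quote_ident(table_name)} AS "
        f"SELECT * FROM read_parquet([{file_list}])"
    )


# Hypothetical file list for the current state of the table.
statement = create_view_sql(
    "my_table",
    ["s3://example-bucket/my_table/data/part-0.parquet"],
)
print(statement)
```

After running the statement in DuckDB, `SELECT ... FROM my_table` works like any other table, but the view has to be recreated whenever the underlying file list changes.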