Hacker News
by zomglings 1861 days ago
These are not idle questions:

1. What do multi-source joins look like?

2. How expensive are they as a function of the sizes of the "tables" being joined?

1 comment

I should clarify: step 1 in most pipelines is pulling data out of the sources and replicating it in Snowflake. After that, a multi-source join is a normal ANSI SQL join on literal tables in different schemas of the same database, not "tables".

(Some call this model "ETLT", where the first ETL part is just moving data from APIs or other databases into a shared db, and the extra "T" is joining that data across sources or otherwise organizing it in useful ways.)
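
To make that concrete, here is a rough sketch of the two phases in Python. It is not Snowflake code: SQLite with two attached in-memory databases stands in for one database with two schemas, and the schema names (crm, billing), table names, and sample rows are all made up for illustration. The point is only that once each source is replicated into its own schema, the "multi-source join" is an ordinary SQL join.

```python
import sqlite3

# Hypothetical rows standing in for data pulled from two separate sources
# (for example a CRM API and a billing database). Not real data.
CRM_CUSTOMERS = [(1, "ada@example.com"), (2, "bob@example.com")]
BILLING_INVOICES = [
    ("ada@example.com", 120.0),
    ("ada@example.com", 30.0),
    ("bob@example.com", 75.0),
]


def load_sources(conn):
    """Phase 1 (the 'ETL' part): replicate each source into its own schema."""
    # Each attached in-memory database plays the role of a schema here.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_email TEXT, amount REAL)")
    conn.executemany("INSERT INTO crm.customers VALUES (?, ?)", CRM_CUSTOMERS)
    conn.executemany("INSERT INTO billing.invoices VALUES (?, ?)", BILLING_INVOICES)
    conn.commit()


def revenue_per_customer(conn):
    """Phase 2 (the extra 'T'): a plain SQL join across the two schemas."""
    query = """
        SELECT c.id, c.email, SUM(i.amount) AS total
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_email = c.email
        GROUP BY c.id, c.email
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load_sources(conn)
rows = revenue_per_customer(conn)
print(rows)
```

With the sample rows above this prints [(1, 'ada@example.com', 150.0), (2, 'bob@example.com', 75.0)]. It says nothing about cost at scale; that depends on the warehouse's query planner and the table sizes, which this toy setup does not model.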

Thank you for your clarification.