by ThomasMoll 909 days ago
When I worked at LinkedIn, we did it with ETL clusters, which we had already built out for moving data between datacenters nightly. They would mirror an HDFS cluster, then run batch jobs to transfer the data either directly to the outbound cluster or to another ETL cluster in another DC.

We used one of our ETL clusters to ship data to MSFT for various LinkedIn integrations, like seeing LinkedIn profile information in Outlook or Office products.

1 comment

Which tools were you using for ETL? Or were they completely custom?