| Not sure if this provides any insight or value, but I had this same experience. My first example was connecting Iterable (which it looks like Airbyte supports) to BigQuery. In the past I had someone help me set up Snowflake, which was too complicated for me to maintain or understand myself, especially since AWS is too complicated for me compared to the simpler Google Cloud. I have also tried Stitch and Fivetran at different times, mostly to try to save time setting up non-webhook syncs from FB Marketing and Front. The volume of Iterable data would be prohibitively expensive for us on those paid platforms. In the end I was able to do FB Marketing myself in less than 1k lines of Python, modified from a script I found on GitHub which used Google Cloud Scheduler and a Cloud Function. I don't know Python, so that felt somewhat satisfying. Another nuance in favor of a hosted/paid platform is that it looks like Airbyte uses an API lookup sync instead of webhooks. That lets Airbyte get some more metadata to join to that I don't collect. That's valuable! For Iterable I ended up making a GAE app to accept incoming webhook data -> push to Pub/Sub -> which pushes to a function -> which writes to BigQuery. The latency of BigQuery writes was too high to do it all in one request, and I don't think Iterable does webhook retries. Also, Iterable is MEGA bursty: I've seen our GAE app scale up to sometimes 40+ instances within minutes after we hit send on a campaign. That was the hardest problem to figure out: getting the latency down for cold starts and scaling. Cloud Functions didn't work for that front end. It's not perfect, but it's good enough for our needs. The simpler FB function grabs the data 100% correctly each day, which feels good. Last I talked to some of the paid ETL vendors it was a flat $500 minimum a month, which was not worth it. From learning all this I've been able to reuse this GAE, Pub/Sub, function, BigQuery/Spanner pattern for other stuff I build, and it has saved a lot of time and headache. |
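
For anyone curious what the webhook -> queue -> worker shape looks like, here is a rough sketch, not my actual code. It assumes an in-memory `queue.Queue` standing in for Pub/Sub and a plain list standing in for the BigQuery table; the names `handle_webhook`, `drain`, `event_queue`, and `warehouse_rows` are made up for illustration. The point is only that the front end acknowledges right away and the slow write happens later in batches.

```python
import json
import queue

# Hypothetical stand-ins: a real deployment would use Pub/Sub and BigQuery.
event_queue = queue.Queue()  # stands in for the Pub/Sub topic
warehouse_rows = []          # stands in for the BigQuery table


def handle_webhook(body: bytes) -> int:
    """Front end: parse the payload, enqueue it, and return a status code."""
    try:
        event = json.loads(body)
    except ValueError:
        return 400  # malformed payload, nothing is enqueued
    event_queue.put(event)
    return 200  # acknowledge fast; the slow write happens in the worker


def drain(batch_size: int = 500) -> int:
    """Worker: pull up to batch_size events and write them as one batch."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(event_queue.get_nowait())
        except queue.Empty:
            break
    if batch:
        warehouse_rows.extend(batch)  # one bulk write instead of one per event
    return len(batch)
```

During a burst the handler keeps returning quickly while the queue absorbs the spike, and `drain` can run on its own schedule. Because the sender may not retry, anything dropped before `event_queue.put` is lost, which is why the front end does as little work as possible.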