by melted 3850 days ago
Because the actual analytics (the part that BigQuery provides) is maybe 20% of this solution, and judging by the slides their ETL is very easy to use. What would I even use on Google Cloud to do ETL? Dataflow? JavaScript UDFs? Something else? All of that seems clunky compared to what these guys are offering. And they have a bunch of data sources available "out of the box" that would be a hassle to deal with manually.

Another issue with BigQuery seems to be the unpredictability of cost. One typo somewhere and you can easily run up a bill in the tens of thousands of dollars because your dashboard isn't caching something. In a similar situation Redshift will merely get slow.

1 comment

Clunky ETL: Please correct me if I'm wrong, but here is what I saw about ETL in the article: "One example we encounter quite often is that Mixpanel stores timestamps in seconds, while Redshift expects timestamps in milliseconds." I would much rather run that kind of transformation inside BigQuery in a couple of seconds than go through a whole pipeline. Other things I could run outside, just as they are doing now, but I didn't see any Redshift-specific advantages for transformations.
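
To make the timestamp example concrete, here is a rough Python sketch of that seconds-to-milliseconds conversion. It assumes events arrive as dicts with a hypothetical `time` field holding epoch seconds; the field name and both function names are made up for illustration and are not taken from the article or from any Mixpanel, Redshift, or BigQuery API.

```python
def seconds_to_millis(ts_seconds):
    """Convert an epoch timestamp in seconds to integer milliseconds."""
    # round() guards against float artifacts such as 1.0005 * 1000 = 1000.4999...
    return int(round(ts_seconds * 1000))


def normalize_events(events, field="time"):
    """Return copies of the events with `field` converted to milliseconds.

    `field` is a hypothetical column name; the input dicts are not mutated.
    """
    normalized = []
    for event in events:
        converted = dict(event)  # shallow copy so the caller's data is untouched
        converted[field] = seconds_to_millis(event[field])
        normalized.append(converted)
    return normalized


if __name__ == "__main__":
    sample = [{"event": "signup", "time": 1438387200}]
    print(normalize_events(sample))
```

The same multiplication could just as well be written as a single expression in a SQL query, which is the point being made here: the transformation is trivial wherever it runs.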

Cost: Cost should be way below other solutions, and to prevent problems BigQuery now has cost controls at the user and project levels: https://cloud.google.com/bigquery/cost-controls

Thanks for your comments!