by frew 3375 days ago
Based on random use over the past few years.

Redshift: Pros: Has the most adoption, so most integrations from SaaS services etc. are built with Redshift as their sink. Relatively fast and battle-tested.

Cons: In an awkward middle ground where you're responsible for a lot of operations (e.g. capacity planning, choosing sort and distribution keys), but don't have a lot of visibility. Some weirdness as a result of taking PostgreSQL and making it distributed.

BigQuery: Pros: Rich feature set. Pay-per-TB pricing. Recently released standard-ish SQL dialect. Very fast.

Cons: JDBC driver is recent and doesn't support e.g. CREATE TABLE AS SELECT (as of a couple of months ago), so it's harder to integrate with existing systems. There are ways to run out of resources (e.g. large ORDER BY results) without a good path to throw more money at the problem.

Athena: Pros: Built on the open-source Presto query engine, so you can use the documentation there. Pay-per-TB pricing.

Cons: Slower than the other options listed here. Very early product, so the documentation is thin and some errors are cryptic. Not a lot of extensibility, but you could theoretically move to just using open-source Presto.
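
For a rough feel of what the pay-per-TB pricing mentioned for BigQuery and Athena means in practice, here is a minimal back-of-the-envelope sketch. The function names are made up for illustration, and the 5.0 USD/TB default is a placeholder assumption, not a quoted price; plug in whatever the current price sheet says.

```python
# Back-of-the-envelope estimate for pay-per-TB-scanned pricing.
# ASSUMPTION: usd_per_tb=5.0 is a placeholder, not an official price.
# ASSUMPTION: 1 TB is taken as 10**12 bytes; providers may bill differently
# (e.g. binary terabytes or per-query minimums), which this sketch ignores.

BYTES_PER_TB = 10 ** 12


def scan_cost_usd(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Cost of a single query that scans `bytes_scanned` bytes."""
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


def monthly_cost_usd(bytes_per_query: int, queries_per_day: int,
                     usd_per_tb: float = 5.0, days: int = 30) -> float:
    """Cost of running the same-sized query `queries_per_day` times a day."""
    return scan_cost_usd(bytes_per_query, usd_per_tb) * queries_per_day * days


if __name__ == "__main__":
    # Hypothetical workload: 50 dashboard queries a day, each scanning 200 GB.
    per_query = 200 * 10 ** 9
    print(f"per query: ${scan_cost_usd(per_query):.2f}")
    print(f"per month: ${monthly_cost_usd(per_query, 50):.2f}")
```

The point of the exercise: cost scales with bytes scanned rather than with cluster size, so anything that cuts the scan (partitioning, columnar formats, selecting fewer columns) cuts the bill directly.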

Haven't had a chance to evaluate Snowflake.


Just a question: which of these have good BI tool support?
All of the above data warehouses have good support, and you can use almost any popular BI tool with any of them. They all have ODBC/JDBC drivers that BI tools can use, and since these are the most popular data warehouses out there, most BI tools ship connectors for them.