by iblaine 4868 days ago
This test essentially compares Postgres against a single node of Redshift. It is not surprising that Postgres is faster, but Redshift is not meant to be used on a single node.

Postgres and Redshift are two different products for two very different problems. Postgres is good for small sets of transactional data (less than 1TB), like orders in a shopping cart system. Redshift is good for big sets of data (greater than 1TB) involving user behavior and clickstream analysis. I would not want to manage clickstream data on a single instance of Postgres, nor would I want to manage an order system in Redshift.

A better test of Redshift would be to see how it compares to Asterdata, particularly with both running in AWS. That should be telling.

1 comment

We don't run a shopping cart, but one of our databases is currently at 11.3TB on PostgreSQL 9.1, and we're by no means dealing with small sets. We routinely juggle several gigabytes at a time when we need to do analytics. We didn't see a reason to put this in the cloud, since bandwidth + electricity is still cheaper for us than bandwidth + storage in the cloud at present.
If you have a few servers to spare, I'd recommend installing Cloudera Impala on them. You can use Apache Sqoop to pull the data out of Postgres and into HDFS. Directly afterwards, you can run SQL queries against the data in parallel (similar to Redshift).
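
For anyone who wants to try the Sqoop step, here is a rough sketch of how the import command might be assembled. The host, database, table, user, and paths are all made-up placeholders, and `build_sqoop_import` is just a hypothetical helper that builds the argument list without running anything. It assumes Sqoop 1 with the PostgreSQL JDBC driver available.

```python
import shlex


def build_sqoop_import(host, database, table, user, password_file,
                       target_dir, mappers=4):
    """Build the argument list for a `sqoop import` run.

    Hypothetical helper: it only assembles the command, it does not
    execute it. All values passed in are placeholders for illustration.
    """
    return [
        "sqoop", "import",
        # JDBC URL of the source Postgres database
        "--connect", f"jdbc:postgresql://{host}/{database}",
        "--username", user,
        # Read the password from a file instead of the command line
        "--password-file", password_file,
        # Source table in Postgres
        "--table", table,
        # Destination directory in HDFS
        "--target-dir", target_dir,
        # Number of parallel map tasks doing the copy
        "--num-mappers", str(mappers),
    ]


# Placeholder values, assumed purely for the example
cmd = build_sqoop_import(
    host="db.example.internal",
    database="analytics",
    table="events",
    user="sqoop_reader",
    password_file="/user/sqoop/pg.password",
    target_dir="/data/events",
)

# Print a shell-safe version of the command for review before running it
print(shlex.join(cmd))
```

Once the files are in HDFS, Impala would still need a table defined over that directory before the data can be queried, so treat this as the first half of the job only.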