by kderbe 939 days ago
The article disappointed me and did not live up to its "at scale" title. At best it repeats the happy-path official documentation, and where it deviates it's for the worse.

I have spent some time administering Redshift. I agree with all of gregw2's points except the one about leader node CPU utilization, which I've never seen go above 20%, even momentarily.

A tip: prefer a larger RA3 node size to having more nodes, e.g. two ra3.16xlarge instead of eight ra3.4xlarge. (This goes against the advice I received from AWS support.) This will give your cluster a beefier leader node for free. Larger worker nodes can also help with a skewed query that would be starved for RAM on smaller nodes--some of my biggest analytical queries tripled in execution time when I tried them on a same-cost cluster with a smaller node size.
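To make the trade-off concrete, here is a rough sketch comparing the two layouts from the tip. The vCPU and memory figures are the per-node specs as I recall them from AWS's published RA3 table (12 vCPU / 96 GiB for ra3.4xlarge, 48 vCPU / 384 GiB for ra3.16xlarge), so check them against the current docs; the `NodeType` class and `cluster_summary` helper are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    vcpus: int
    memory_gib: int


# Assumed per-node specs; verify against the current AWS documentation.
RA3_4XL = NodeType("ra3.4xlarge", vcpus=12, memory_gib=96)
RA3_16XL = NodeType("ra3.16xlarge", vcpus=48, memory_gib=384)


def cluster_summary(node: NodeType, count: int) -> dict:
    """Summarize a cluster of `count` compute nodes of one node type."""
    return {
        "total_vcpus": node.vcpus * count,
        "total_memory_gib": node.memory_gib * count,
        # RAM available to the slices on a single node, which is what a
        # skewed query ends up competing for.
        "per_node_memory_gib": node.memory_gib,
        # The leader node is the same node type as the compute nodes.
        "leader_vcpus": node.vcpus,
    }


small = cluster_summary(RA3_4XL, 8)
large = cluster_summary(RA3_16XL, 2)

for label, summary in (("8 x ra3.4xlarge", small), ("2 x ra3.16xlarge", large)):
    print(label, summary)
```

Under those assumed specs the aggregate compute and memory come out the same, but each node (leader included) in the second layout has four times the RAM and vCPUs.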

1 comment

Agree with your tip about larger RA3 node sizes. An advisor at AWS clued me in to it during a resizing POC, and since we switched one big cluster to ra3.16xlarge, leader node CPU utilization has definitely dropped for our peak Monday morning workloads.

(Yes, with a very large ra3.4xlarge cluster and substantial Redshift Concurrency Scaling, our leader node was nudging above 50% CPU during the busiest hour.)

I'm still not clear whether switching to SNAPSHOT_ISOLATION over SERIALIZABLE actually helped or hurt our huge workload. Anecdotally, we sort of traded deadlock-killed queries for longer-locked queries: fewer of the former, more of the latter, which may have grown our overall expenses.

One minor thing I've anecdotally noticed recently, which seems to have changed at some point (perhaps around the time of our transition to ra3.16xlarge, not sure): in the old days, Redshift Concurrency Scaling seemed to kick in the minute a single query was queued up. Nowadays it seems to kick in about 5 minutes later.

I'd prefer it if AWS added some ability to only activate concurrency scaling when a queue has X queries queued or queries queued for Y minutes or something.
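To spell out the kind of rule I mean, here is a minimal sketch. This is not an existing Redshift setting or API; `should_activate_scaling` and its parameters are hypothetical names, with `min_queued` standing in for X and `max_wait_seconds` for Y.

```python
from typing import Sequence


def should_activate_scaling(
    queued_wait_seconds: Sequence[float],
    min_queued: int,
    max_wait_seconds: float,
) -> bool:
    """Hypothetical policy: scale out only past a queue-depth or wait-time threshold.

    queued_wait_seconds holds, for each currently queued query, how long
    it has been waiting so far.
    """
    # Condition 1: at least X queries are waiting in the queue.
    if len(queued_wait_seconds) >= min_queued:
        return True
    # Condition 2: some query has been waiting for at least Y seconds.
    return any(wait >= max_wait_seconds for wait in queued_wait_seconds)


# One query queued for 30 seconds: stay on the main cluster.
print(should_activate_scaling([30.0], min_queued=5, max_wait_seconds=120.0))
# One query queued for 3 minutes: scale out.
print(should_activate_scaling([180.0], min_queued=5, max_wait_seconds=120.0))
```

With the thresholds as knobs, a single briefly queued query would no longer spin up a scaling cluster, while a deep or long-stalled queue still would.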