HN Mirror


	by onemoreaccount 4602 days ago
	We use Redshift in production to power a customer-facing app, but we don't have the web app hit it directly (that really couldn't work with the concurrent query limits). Workers query RS and cache the results in another database, and the web app hits that database. We handle writes in the low tens of billions per day. Our data format is very simple and we get great compression, so we currently fit everything on two clusters: one of 3 XLs and one of 6 XLs. Performance is great, except when it's not. Simple aggregations on tables with < 1B rows and two-table joins on tables with < 100M rows are blazingly fast (maybe 1-30 seconds depending on the query). With tables larger than that, it can start to crawl.
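	For anyone curious what the worker/cache split might look like, here is a rough sketch and not our actual code. It uses in-memory SQLite for both the warehouse and the cache so it runs anywhere; in a real setup the warehouse side would be a connection to the Redshift cluster. The table names (`events`, `customer_totals`) and the function names are made up for illustration.

```python
import sqlite3


def make_warehouse():
    # Stand-in for the Redshift cluster (assumption: the real thing would be
    # a warehouse connection, not SQLite). Holds the raw event rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (customer_id INTEGER, amount INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(1, 10), (1, 5), (2, 7)],
    )
    return conn


def make_cache():
    # Stand-in for the serving database that the web app reads from.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_totals ("
        "customer_id INTEGER PRIMARY KEY, total INTEGER, event_count INTEGER)"
    )
    return conn


def refresh_cache(warehouse, cache):
    # Worker step: run one (potentially slow) aggregation on the warehouse,
    # then upsert the small result set into the cache in one transaction.
    rows = warehouse.execute(
        "SELECT customer_id, SUM(amount), COUNT(*) "
        "FROM events GROUP BY customer_id"
    ).fetchall()
    with cache:
        cache.executemany(
            "INSERT OR REPLACE INTO customer_totals VALUES (?, ?, ?)", rows
        )
    return len(rows)


def get_customer_total(cache, customer_id):
    # Web app step: a cheap primary-key lookup that never touches the
    # warehouse. Returns None if the worker has not cached this customer yet.
    return cache.execute(
        "SELECT total, event_count FROM customer_totals WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
```

	The point of the split is that only a handful of workers ever hold warehouse connections, so the number of concurrent queries stays bounded no matter how much web traffic there is. The trade-off is that the web app sees data that is only as fresh as the last worker run.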