

by vlahmot 3205 days ago

You don't need Redshift and it's not really the best for "combining data".

I'd throw the data on S3, do the processing in Spark (you can likely run on one node in local mode for now at that scale, and scale out as the data grows), write the results back to S3, then load that processed/aggregated data from S3 into MySQL, since you're running that already and can just plug in your BI tools.
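
A rough PySpark sketch of that flow, in case it helps to see it end to end. Everything specific here is an assumption for illustration: the bucket paths, the `event_time` and `event_type` columns, the `daily_counts` table, and the `JobConfig`, `mysql_options`, and `run` names are all made up. It also assumes the `hadoop-aws` package and a MySQL Connector/J jar are available to Spark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    # Hypothetical placeholders; substitute your own bucket, host, and table.
    raw_path: str = "s3a://example-bucket/raw/events/"
    processed_path: str = "s3a://example-bucket/processed/daily_counts/"
    jdbc_url: str = "jdbc:mysql://example-host:3306/analytics"
    mysql_table: str = "daily_counts"


def mysql_options(cfg: JobConfig, user: str, password: str) -> dict:
    """Build the options dict for Spark's built-in JDBC data source."""
    return {
        "url": cfg.jdbc_url,
        "dbtable": cfg.mysql_table,
        "user": user,
        "password": password,
        "driver": "com.mysql.cj.jdbc.Driver",  # Connector/J 8.x class name
    }


def run(cfg: JobConfig, user: str, password: str) -> None:
    # Imported here so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .master("local[*]")  # one node, local mode; change the master to scale out
        .appName("s3-to-mysql")
        .getOrCreate()
    )

    # 1. Read the raw data straight from S3 (assumed to be JSON lines).
    raw = spark.read.json(cfg.raw_path)

    # 2. Aggregate in Spark: events per day and type (assumed columns).
    daily = (
        raw.groupBy(F.to_date("event_time").alias("day"), "event_type")
        .agg(F.count("*").alias("events"))
    )

    # 3. Write the processed result back to S3 so S3 stays the source of truth.
    daily.write.mode("overwrite").parquet(cfg.processed_path)

    # 4. Load the processed data from S3 into MySQL for the BI tools.
    #    Note: "overwrite" replaces the target table.
    (
        spark.read.parquet(cfg.processed_path)
        .write.format("jdbc")
        .options(**mysql_options(cfg, user, password))
        .mode("overwrite")
        .save()
    )

    spark.stop()
```

You'd call something like `run(JobConfig(), user, password)` from a scheduled job. Moving from local mode to a cluster later should mostly be a matter of changing the master setting, not the processing code.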

It's much easier to process data outside the DB, S3 works great as a source of truth in AWS, and it's much cheaper.