Hacker News
by
vlahmot
3177 days ago
We do the EMR-backed-by-S3 setup, only with Snappy instead of gzip, since gzip files can't be split.
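To illustrate the "can't be split" point, here is a minimal sketch using only the Python standard library. It is not the poster's pipeline; the payload and the helper name `try_decompress_from` are made up for the demo. It shows that a gzip stream decodes from its first byte but not from an arbitrary offset in the middle, which is why a single .gz file generally ends up handled by one task instead of being divided among several.

```python
import gzip
import zlib

# Hypothetical payload: newline-delimited records, similar in shape to log lines.
payload = b"".join(b"event %06d\n" % i for i in range(50_000))
compressed = gzip.compress(payload)


def try_decompress_from(blob: bytes, offset: int) -> bool:
    """Return True if a gzip reader can start decoding at `offset`."""
    try:
        gzip.decompress(blob[offset:])
        return True
    except (OSError, EOFError, zlib.error):
        # No gzip header at this offset, so the reader has nowhere to begin.
        return False


from_start = try_decompress_from(compressed, 0)
from_middle = try_decompress_from(compressed, len(compressed) // 2)

print("decodes from start: ", from_start)
print("decodes from middle:", from_middle)
```

A reader handed only the second half of the file has no header to start from, so the whole file has to be read from the beginning.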
mattbillenstein
3177 days ago
Ah, word, do you roll up the data by day? Or hour? I think in a situation where you roll it up by hour and you have a lot of files, it can be spread out pretty evenly on a large cluster.
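As a rough sketch of what an hourly rollup could look like, the snippet below groups timestamped lines under one key prefix per hour. The `logs/dt=.../hour=.../` layout and the names `hourly_prefix` and `roll_up` are assumptions for illustration, not something described in this thread.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_prefix(ts: float) -> str:
    """Map a Unix timestamp to a hypothetical per-hour key prefix (UTC)."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.strftime("logs/dt=%Y-%m-%d/hour=%H/")


def roll_up(events):
    """Group (timestamp, line) pairs by their hourly prefix."""
    buckets = defaultdict(list)
    for ts, line in events:
        buckets[hourly_prefix(ts)].append(line)
    return dict(buckets)


# Three sample events: two in the first hour of the epoch, one in the second.
sample = [(0, "a"), (1800, "b"), (3600, "c")]
rolled = roll_up(sample)
for prefix in sorted(rolled):
    print(prefix, len(rolled[prefix]))
```

Each hourly bucket would then be written out as one or more files, so a day of data yields at least 24 inputs that can be handed to different workers.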