by viralbajaria
4881 days ago
I agree with your points that PostgreSQL (or an RDBMS in general) is really good for certain types of reporting/analytics use cases, while Hadoop/Hive is awesome for handling billions of rows and TBs of data. How was your overall experience with Impala? Did you guys have a fairly new Hive cluster to try it out on, or did you spin up a new one, since Impala can only read certain file formats (i.e. no custom SerDe)? Also, are the Hive/Hadoop datasets more for data exploration, while this PostgreSQL solution is for smaller datasets where queries return in a few seconds and would not perform well in Hive due to the cost of setting up a MapReduce job?