by pavanred 4554 days ago
Pig is another option. It lets you run SQL-like commands from the Grunt shell, which makes Hadoop a lot easier to use.

Going from Hive/Pig to Spark substantially improves developer productivity (for non-reporting/BI workloads). You can properly unit test your program, use a debugger, and keep all your code in one place and in one language (with Pig, by contrast, you write UDFs in Java and then specify the workflow in a pseudo-scripting language).
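
To make the unit-testing point concrete, here is a rough sketch in Python (assuming PySpark; the function names and the word-count task are made up for illustration, not taken from any real job). The per-record logic is an ordinary function, so you can test it and step through it in a debugger without a cluster, and the same function is then handed to the Spark job:

```python
import re
from collections import Counter


def tokenize(line):
    """Split a line into lowercase words. Plain Python, no cluster needed."""
    return re.findall(r"[a-z0-9']+", line.lower())


def count_words_local(lines):
    """In-process reference implementation, handy for unit tests."""
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return dict(counts)


def count_words_spark(sc, path):
    """Same tokenize() reused inside a Spark job.

    Assumptions: `sc` is a pyspark SparkContext and `path` is a
    hypothetical input location; this function is not run by the tests.
    """
    return (
        sc.textFile(path)
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)   # sum the counts per word
        .collectAsMap()                    # bring the result back as a dict
    )
```

The logic you care about (`tokenize` here) lives in the same file and language as the job definition, so a normal test runner covers it. With Pig, the equivalent would typically be a Java UDF plus a separate Pig Latin script.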

And those are just the productivity gains, not to mention the performance gains you get when you go from MapReduce to Spark.