


	by gknight 4554 days ago
	Have you tried Apache Hive? I believe it was meant to make Hadoop easier to use by way of SQL-like commands. Something like Qubole might be able to help too.

3 comments

lmm 4553 days ago

I have; I should've mentioned that as an option. But I find it much easier to think in Scala than in an SQL-like language.


pavanred 4554 days ago

Pig is another option. It lets you run SQL-like commands from the Grunt shell, which makes Hadoop a lot easier to use.


rxin 4554 days ago

Going from Hive/Pig to Spark substantially improves developer productivity (for non-reporting/BI workloads). You can properly unit test your program, use a debugger, and keep all your code in one place and in one language (with Pig, by contrast, you write UDFs in Java and then use a pseudo-scripting language to specify the workflow).

And those are just the productivity gains, not to mention the performance gains you get when you go from MapReduce to Spark.
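
To make the unit-testing point concrete, here is a minimal sketch in Python, not taken from the thread. It assumes the job logic is kept in ordinary functions so it can be tested and stepped through in a debugger without a cluster. The names `tokenize`, `count_words`, and `test_count_words` are hypothetical, and `count_words` is only a local stand-in for what a distributed word-count pipeline would compute.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tokenize(line: str) -> List[str]:
    """Split one line of text into lowercase words."""
    return line.lower().split()


def count_words(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Count words across all lines (local stand-in for the cluster job)."""
    counts = Counter(word for line in lines for word in tokenize(line))
    # Sort by descending count, then alphabetically, so output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def test_count_words() -> None:
    """Plain unit test: no cluster, no job submission, just function calls."""
    lines = ["Spark and Hive", "spark and Pig"]
    expected = [("and", 2), ("spark", 2), ("hive", 1), ("pig", 1)]
    assert count_words(lines) == expected


if __name__ == "__main__":
    test_count_words()
    print(count_words(["Spark and Hive", "spark and Pig"]))
```

Because `tokenize` is a plain function, the same code could in principle be passed to a distributed operation such as PySpark's `rdd.flatMap(tokenize)`, which is roughly the "all your code in the same language" benefit described above.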


ianburrell 4554 days ago

There is also Shark, which is Hive implemented on top of Spark instead of Hadoop MapReduce.
