by ivanprado
5220 days ago
Hi, I'm one of the developers of Pangool. The idea of Pangool is not to be yet another higher-level API on top of Hadoop but rather to be a replacement for the low-level Hadoop Java MapReduce API. Pangool has the same performance and flexibility as the Java MapReduce API, although it makes several things a lot easier and more convenient. There is no tradeoff, just advantages. There will be cases where you'd want to use Pig or Cascading. There will be other cases where you'd want the flexibility and efficiency of MapReduce. For those cases we conceived Pangool. Nowadays only very advanced Hadoop users can write efficient MapReduce jobs. Pangool hides all the advanced boilerplate code needed for writing highly efficient MapReduce jobs, making things like secondary sorting or reduce-side joins extremely easy.
|
Though I don't have deep expertise in Hadoop, I find this claim highly suspect. High-level APIs achieve user-friendliness by making decisions and assumptions about the way a lower-level API will be used. I would be very surprised if there were no use case for which your API imposes a trade-off vs. the low-level Hadoop API.
I feel much more confident using a high-level API if its author is up-front about what assumptions it's making. If the claim is that there is no trade-off vs. the low-level API, I generally conclude that the author doesn't understand the problem space well enough to know what those trade-offs are.
I could be wrong, but this is my bias/experience.