by dguaraglia
3709 days ago
My 2 cents: I would not recommend basing any new work on MRjob. As someone who inherited and has been maintaining a bunch of code that depends on it, I can say the library seems to be barely maintained: support for VPC is only partial and not very well documented, the auditing tools stopped working quite a while ago, and tracking the progress/status of EMR jobs is extremely painful (to be fair, this is more of an issue with Elastic MapReduce than with MRjob itself). I love the concept and ease of development, but I can't shake the feeling that the infrastructure is so shaky it almost amounts to instant technical debt (sorry if this offends anyone, I'm just a dumb customer).
[1]: https://github.com/Yelp/mrjob/releases