by conflagration 5534 days ago
Mahout and EMR are a powerful combination, and I can also recommend the boto library for managing EMR. If you use them together, be sure to select Hadoop version 0.20 when running a jobflow. If you are dealing with explicit data, like star ratings, Pearson correlation might get you better results.
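To make the Pearson suggestion concrete, here is a minimal plain-Python sketch of Pearson correlation between two users' star ratings, computed only over the items both have rated. It is an illustration, not Mahout's code: the function name `pearson_similarity`, the dict-of-ratings layout, and the sample users are all made up for the example, and returning 0.0 when the correlation is undefined is an assumption rather than anything Mahout prescribes.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated.

    ratings_a / ratings_b: dicts mapping item id -> star rating (hypothetical layout).
    Returns 0.0 when the correlation is undefined (assumed convention).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < 2:
        # Fewer than two co-rated items: no meaningful correlation.
        return 0.0

    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Covariance and variances, left unnormalised since the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # One user gave identical ratings to every co-rated item.
        return 0.0
    return cov / sqrt(var_x * var_y)


# Made-up sample data for illustration only.
alice = {"item1": 5, "item2": 3, "item3": 4}
bob = {"item1": 4, "item2": 2, "item3": 3, "item4": 5}
carol = {"item1": 1, "item2": 5, "item3": 2}

print(pearson_similarity(alice, bob))
print(pearson_similarity(alice, carol))
```

Because each user's ratings are centred on their own mean, a user who rates everything one star lower than another (bob versus alice here) still comes out as highly similar, which is the main reason it tends to suit star ratings.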