by dgomez1092 4104 days ago
I'm curious why it is harder to run a K-Means clustering analysis as MapReduce jobs. As far as computational effort goes, what does it cost the mapper, in CPU and RAM, to access the remote file holding all the iterative data? I'm assuming Apache Spark would be better, since it allows in-memory processing. Is there really a significant difference in computational usage, e.g. in Mbps?
1 comment

k-means clustering is iterative, while standard MapReduce is single pass, so every iteration has to run as a separate job that reads its input again.
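
To make the iterative point concrete, here is a rough sketch in plain Python (not Hadoop or Spark code) of k-means written as repeated map and reduce passes. The function names `map_phase`, `reduce_phase` and `kmeans` are made up for illustration, and the data lives in a local list; in a real MapReduce setup each trip through the loop would be its own job reading the input from storage again.

```python
from collections import defaultdict

def closest(point, centroids):
    # Index of the nearest centroid by squared Euclidean distance.
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )

def map_phase(points, centroids):
    # "Map": emit (centroid index, point) for every input point.
    for point in points:
        yield closest(point, centroids), point

def reduce_phase(pairs, centroids):
    # "Reduce": average the points assigned to each centroid.
    groups = defaultdict(list)
    for idx, point in pairs:
        groups[idx].append(point)
    new_centroids = list(centroids)  # centroids with no points stay put
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return new_centroids

def kmeans(points, centroids, max_iters=20):
    # Each loop iteration is one full map + reduce pass over ALL the points.
    passes = 0
    for _ in range(max_iters):
        new_centroids = reduce_phase(map_phase(points, centroids), centroids)
        passes += 1
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, passes

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_centroids, passes = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
print(final_centroids, passes)
```

Even this tiny input needs more than one full pass over the data before the centroids stop moving, which is why keeping the points cached in memory between passes helps.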