Hacker News
Mining of Massive Datasets (mmds.org)
109 points by markhkim 3502 days ago
5 comments

This Coursera course was taken down, but it is now up and running at lagunita.stanford.edu [0], which uses edX's open source platform [1]. The same happened to other Stanford courses previously on Coursera; you can find them here [2], including Compilers, Automata Theory, and Convex Optimization.

[0] https://lagunita.stanford.edu/courses/course-v1:ComputerScie...

[1] https://open.edx.org/ https://github.com/edx/edx-platform

[2] https://lagunita.stanford.edu/courses

What amazed me is how much of this is 1990s stuff.
How do you mean? Do you know of any up-to-date books on large-scale data mining?
Massive is an understatement. I have only dealt with puny GB-sized data sets. They deal with vectors that cannot fit into main memory.
Yes, in general they are referring to things like IRS tax records (250 TB) or Yahoo ad data (900 TB). You just can't use a single machine to work with such data.
The deadlines are on the 29th of November. Best of luck.
tl;dr: parallel MapReduce. ;)
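
For anyone unfamiliar with the pattern, here is a minimal sketch of a map step followed by a reduce step, using word counting as the toy problem. It is not taken from the book or the course: the names `map_chunk`, `merge_counts`, and `word_count` are made up for illustration, and a local thread pool stands in for the cluster of machines a real deployment would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(lines):
    """Map step: count the words in one chunk of lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def word_count(lines, chunk_size=2, workers=4):
    # Split the input into chunks; on a real cluster each chunk would
    # live on a different machine instead of in one Python list.
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    # Run the map step on every chunk concurrently. The thread pool is an
    # assumption made for brevity, standing in for distributed workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    # Fold the partial results together into the final count.
    return reduce(merge_counts, partials, Counter())
```

The point of the split is that each map call only ever sees its own chunk, so no single worker needs the whole data set in memory; only the much smaller partial counts have to be brought together in the reduce step.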