Hacker News
by sharun 3352 days ago
You could call it version 1.0 of the system Neo uses to learn everything about everything in the Matrix :)

I don't think NLP is necessary though.

If you take a look at the Stack Overflow data dump, sooner or later every possible variation of a question for a particular answer is going to get asked. Of course there are always new topics that haven't been covered, but these are a minuscule portion of the entire data set. Looking at the Q&A happening on sites like Quora, Reddit, etc., I think it's a safe bet that in a couple of years the chances that someone is going to come up with a question no one has asked before will be very low.

Wikipedia is missing 2 important pieces.

1. Linkage between all these questions and the content on the site. Currently Google provides this link.

2. A system to communicate to the reader what skill level/pre-reqs they need to fully appreciate/understand the content they are looking at. The UI for such a system already exists in most games.

Once these pieces are in place you are ready to create a very useful Anki deck on anything.
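
To make the two pieces a bit more concrete, here is a rough sketch of how they might fit together. Everything in it is made up for illustration: the Article class, its fields, the numeric level, and the sample data are my assumptions, not anything Wikipedia or Anki actually provides. It just shows questions linked to content (piece 1), pre-reqs attached to each article (piece 2), and cards emitted in an order where the pre-reqs come first.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical record for one wiki article (all fields are assumptions)."""
    title: str
    summary: str
    level: int  # made-up skill level, 1 = beginner
    prereqs: list = field(default_factory=list)    # titles of prerequisite articles
    questions: list = field(default_factory=list)  # Q&A questions linked to this article


def study_order(articles, target):
    """List the target's prerequisites before the target itself (depth-first)."""
    order, seen = [], set()

    def visit(title):
        if title in seen:
            return
        seen.add(title)
        for prereq in articles[title].prereqs:
            visit(prereq)
        order.append(title)  # appended only after all of its prereqs

    visit(target)
    return order


def build_deck(articles, target):
    """Turn linked questions into (front, back) flashcard pairs, pre-reqs first."""
    deck = []
    for title in study_order(articles, target):
        article = articles[title]
        for question in article.questions:
            back = f"{article.summary} (see: {article.title}, level {article.level})"
            deck.append((question, back))
    return deck


def to_tsv(deck):
    """Format cards as tab-separated lines, one card per line."""
    return "\n".join(f"{front}\t{back}" for front, back in deck)


# Made-up sample data.
articles = {
    "Function": Article("Function", "A mapping from inputs to outputs.", 1,
                        [], ["What is a function?"]),
    "Limit": Article("Limit", "The value a function approaches.", 2,
                     ["Function"], ["What does a limit mean?"]),
    "Derivative": Article("Derivative", "The instantaneous rate of change.", 3,
                          ["Limit"], ["How do I find the slope of a curve?",
                                      "What is a derivative?"]),
}

deck = build_deck(articles, "Derivative")
print(to_tsv(deck))
```

The tab-separated output is chosen because Anki can import plain text files with one note per line. The hard part, of course, is filling in the prereqs and questions fields at scale, which is exactly the linkage that is missing today.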

What most people don't realize is that the entire mass of human knowledge, going by the size of most wiki/Q&A site dumps, is about 100-150 GB. Throw in all the edu video content being produced by Khan Academy, NPTEL, etc. and you get to 1-2 TB. This isn't a big amount of data. All it needs is a learning system built on top of it for it all to be put to good use.