by paulmolloy 2655 days ago
I've spent the last few months on research for a recommender system built on XML wiki article dumps. I've been using mwparserfromhell to extract plain text and some other metadata I needed from the articles to create a dataset. It seems to work pretty well for that use case, anyway.
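
A rough sketch of that kind of pipeline, not the code from the research itself. It assumes the dump follows the usual MediaWiki export layout (`<page>` elements holding a `<title>` and a revision `<text>`), and the helper names `iter_pages` and `to_record` are made up for illustration. The only mwparserfromhell calls used are `parse`, `strip_code` and `filter_wikilinks`; which metadata to keep (wikilinks here) is also an assumption.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export.

    Hypothetical helper: streams the file so a large dump is not
    loaded into memory all at once.
    """
    for _, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the finished page's children to save memory


def to_record(title, wikitext):
    """Turn one page into a dataset row (hypothetical row layout)."""
    import mwparserfromhell  # third-party: pip install mwparserfromhell

    code = mwparserfromhell.parse(wikitext)
    return {
        "title": title,
        "text": code.strip_code(),  # markup removed, plain text kept
        "links": [str(link.title) for link in code.filter_wikilinks()],
    }


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real dump file.
    sample = (
        b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
        b"<page><title>Foo</title><revision>"
        b"<text>Hello [[Bar]]</text>"
        b"</revision></page></mediawiki>"
    )
    for title, wikitext in iter_pages(io.BytesIO(sample)):
        print(to_record(title, wikitext))
```

For a real dump, pass an open binary file handle (or a `bz2.open(...)` handle for compressed dumps) to `iter_pages` in place of the in-memory sample.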