Hacker News
by rocheio 1666 days ago
> What I like most about it is how easy it is to achieve something useful with a very moderate amount of code.

100%. That's one of the best things about both Wikipedia and Python IMO: neither may deliver perfect results, but they get you WORKABLE results very quickly.

I was also delighted to read this article about writing a Python parser for Wikipedia on a Jekyll blog... because I did an eerily similar thing ~5 years ago and it's still my most-starred repo: https://roche.io/2016/05/scrape-wikipedia-with-python. Small world :)

Best of luck with the project! On one hand it seems impossible, given all the irregularities in article structure and how hard it is to QA the long tail of niche topics. But on the other, if you can manage to wrangle 99% of it into a reliable query language... that could mean a lot to many other side projects!