by fauigerzigerk 4833 days ago
Sadly, they don't publish up-to-date HTML dumps, and there is no reliable way of reproducing them short of installing the entire Wikipedia system locally, including the database. I know there are quite a few projects that claim to do it, but as far as I know they're all abandoned, incomplete, or unsuitable in various other ways.

Not sure if this helps, but there's the Kiwix offline Wikipedia reader and its associated Zim files:

http://www.kiwix.org/

http://download.kiwix.org/zim/0.9/

http://www.kiwix.org/wiki/Tools/en#Generating_ZIM_Files_From...

http://www.openzim.org/

"The entire Wikipedia system, including the database" is slightly overblown: it's just MediaWiki and MySQL.
You need MediaWiki (plus a finely tuned PHP setup, because MediaWiki is unusably slow without one), MySQL, and plenty of time and resources for the large import of enwiki (since that is what most people are interested in).
Exactly, and at that point you still don't have the static HTML files. You have to crawl the entire local site, which takes ages. Then you have to repeat all of this according to your desired update frequency.
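To make the crawl step concrete, here is a rough sketch of what "crawl the entire local site" could look like. It is only an illustration, not how any of the projects mentioned here work. It assumes a local MediaWiki whose articles live under a `/wiki/` path prefix; the names `crawl`, `fetch_http` and `LinkParser`, and the `http://localhost/wiki/Main_Page` start URL, are made up for the example.

```python
from collections import deque
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import quote, urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_http(url):
    """Default fetcher: download a page and decode it as UTF-8."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, out_dir, fetch=fetch_http, prefix="/wiki/", limit=None):
    """Breadth-first crawl of same-host article pages, saving each as a file.

    `prefix` is an assumption about the local URL layout; adjust it to
    match your installation. Returns the URLs saved, in visit order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    saved = []
    while queue and (limit is None or len(saved) < limit):
        url = queue.popleft()
        html = fetch(url)
        # Percent-encode the path so it is safe to use as a flat file name.
        name = quote(urlparse(url).path.strip("/"), safe="") + ".html"
        (out / name).write_text(html, encoding="utf-8")
        saved.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments.
            target, _ = urldefrag(urljoin(url, href))
            parts = urlparse(target)
            is_article = (
                parts.netloc == host
                and parts.path.startswith(prefix)
                and not parts.query  # skip edit/history style URLs
            )
            if is_article and target not in seen:
                seen.add(target)
                queue.append(target)
    return saved
```

Something like `crawl("http://localhost/wiki/Main_Page", "html_dump", limit=100)` would be a reasonable smoke test before letting it loose. It fetches one page at a time, which is exactly why a full enwiki crawl takes ages, and it does nothing about images, stylesheets or rewriting links to point at the saved files.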
EC2 + S3

Charge a minimal fee to cover the bandwidth and hosting costs, and donate any profit to the Wikimedia Foundation.

That solves the same problem for lots of people.