by jsheard 410 days ago
> I tried using one of these dumps a year ago (wanted to play around and see what visualizations I could come up with based on text and the links between pages) and it was an incredibly unintuitive process.

More recently they started putting the data up on Kaggle in a format which is supposed to be easier to ingest.

https://enterprise.wikimedia.com/blog/kaggle-dataset/

2 comments

"More recently" here means very recently; there hasn't been enough time yet for data collectors to evaluate changing their processes.
Good timing to learn about this, given that it's Friday. Thanks! I'll check it out.