Hacker News new | ask | show | jobs
by Philpax 361 days ago
The Wikipedia data dumps [0] are multistream bz2. This makes them relatively easy to partially ingest, and I'm happy to be able to remove the C dependency from the Rust code I have that deals with said dumps.

[0]: https://meta.wikimedia.org/wiki/Data_dump_torrents#English_W...