by mafintosh 2226 days ago
Yep! That is one of the things we’ve worked the hardest on. Completely new indexing structure, using an append-only hash trie which scales really well. We’ve tested it with many big datasets including importing all of Wikipedia as files in a single folder. Worked like a charm :)
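For readers unfamiliar with the term, here is a rough sketch of how an append-only hash trie can work. It is an illustration under stated assumptions, not the project's actual implementation: the class name `AppendOnlyHashTrie`, the `hash_path` helper, the use of SHA-256, and the 2-bit branching are all hypothetical choices made for this example. Every write appends one entry to a log, and each entry carries pointers to the most recent earlier entries whose hash paths branch away from its own, so a lookup starts at the newest entry and hops backwards.

```python
import hashlib


def hash_path(key):
    """Hash the key and split the digest into 2-bit digits (values 0..3)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return [(byte >> shift) & 3 for byte in digest for shift in (6, 4, 2, 0)]


def first_split(a, b, start):
    """Index of the first position >= start where two paths differ, or None."""
    for i in range(start, len(a)):
        if a[i] != b[i]:
            return i
    return None


class AppendOnlyHashTrie:
    """Illustrative key/value index stored purely as an append-only log."""

    def __init__(self):
        self.log = []  # entries are never modified or removed once appended

    def put(self, key, value):
        path = hash_path(key)
        trie = {}  # (position, digit) -> log index of the newest entry on that branch
        seq = len(self.log) - 1 if self.log else None  # start at the newest entry
        i = 0
        while seq is not None:
            node = self.log[seq]
            j = first_split(node["path"], path, i)
            end = len(path) if j is None else j + 1
            # Inherit the node's pointers to branches that also diverge from our path.
            for (pos, digit), target in node["trie"].items():
                if i <= pos < end and digit != path[pos]:
                    trie[(pos, digit)] = target
            if j is None:
                break  # same hash path: the new entry supersedes this one
            # The node itself is the newest entry on the branch we are leaving.
            trie[(j, node["path"][j])] = seq
            # Continue with the newest entry that shares one more digit with us.
            seq = node["trie"].get((j, path[j]))
            i = j + 1
        self.log.append({"key": key, "value": value, "path": path, "trie": trie})

    def get(self, key):
        path = hash_path(key)
        seq = len(self.log) - 1 if self.log else None
        i = 0
        while seq is not None:
            node = self.log[seq]
            j = first_split(node["path"], path, i)
            if j is None:
                return node["value"] if node["key"] == key else None
            seq = node["trie"].get((j, path[j]))
            i = j + 1
        return None


db = AppendOnlyHashTrie()
for n in range(500):
    db.put(f"file-{n}", n)
db.put("file-7", "updated")  # an overwrite is just another appended entry
```

In this sketch a lookup follows at most one pointer per hash digit and old entries are never rewritten, which is the general property that lets such a structure handle a very large number of files in one folder.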

This one? https://dumps.wikimedia.org/other/static_html_dumps/current/... How long does it take to import it?
I think it was that one yes. Can’t remember the exact time it took, as we ran it over a couple of days due to some unrelated computer issues.