by mailxplorer 3010 days ago
CLucene was way too slow for body text once it got past 1GB. I had my own header parser in C++ (though you can do that easily in Python).
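
For the Python route, here is a minimal sketch using the standard library's email package. It is not the C++ parser mentioned above; the sample message and the `parse_headers` name are made up for illustration, and it assumes one message per string.

```python
from email.parser import HeaderParser
from email.policy import default

# Hypothetical sample message, used only to exercise the function.
RAW = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: Test message\n"
    "Date: Mon, 01 Jan 2018 12:00:00 +0000\n"
    "\n"
    "Body text is not parsed here.\n"
)


def parse_headers(raw: str) -> dict:
    """Return the headers of one raw message as a lowercase-keyed dict.

    HeaderParser stops at the blank line that ends the header block,
    so the body is never parsed. Repeated headers (e.g. Received)
    collapse to the last value in this simple dict form.
    """
    msg = HeaderParser(policy=default).parsestr(raw)
    return {name.lower(): str(value) for name, value in msg.items()}


if __name__ == "__main__":
    for name, value in parse_headers(RAW).items():
        print(f"{name}: {value}")
```

If repeated headers matter, `msg.get_all(name)` returns every value instead of just one.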

I'm trying again on that 800GB with KISS DB (an append-only hash table) and Elasticsearch. GPL licensing doesn't matter here because it's a website.
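
To show the general idea of an append-only key/value store, here is a toy sketch. It is not KISS DB's actual file format or API: the `AppendOnlyStore` class and its record layout are hypothetical, and it keeps the hash index in memory, which would not scale to 800GB of keys as-is.

```python
import os
import struct


class AppendOnlyStore:
    """Toy append-only log with an in-memory hash index (illustrative only)."""

    # Each record: 4-byte key length, 4-byte value length, key, value.
    _HEADER = struct.Struct("<II")

    def __init__(self, path):
        self._index = {}  # key -> (value offset, value length)
        self._file = open(path, "a+b")  # writes always land at the end
        self._load()

    def _load(self):
        """Rebuild the index by scanning the existing log front to back."""
        self._file.seek(0)
        while True:
            pos = self._file.tell()
            header = self._file.read(self._HEADER.size)
            if len(header) < self._HEADER.size:
                break
            klen, vlen = self._HEADER.unpack(header)
            key = self._file.read(klen)
            self._file.seek(vlen, os.SEEK_CUR)  # skip over the value
            self._index[key] = (pos + self._HEADER.size + klen, vlen)

    def put(self, key: bytes, value: bytes) -> None:
        """Append a record; older records for the same key are left in place."""
        self._file.seek(0, os.SEEK_END)
        pos = self._file.tell()
        self._file.write(self._HEADER.pack(len(key), len(value)) + key + value)
        self._file.flush()
        self._index[key] = (pos + self._HEADER.size + len(key), len(value))

    def get(self, key: bytes):
        """Return the latest value for key, or None if it was never written."""
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, vlen = entry
        self._file.seek(offset)
        return self._file.read(vlen)

    def close(self) -> None:
        self._file.close()
```

Nothing is ever overwritten, so an update is just a newer record that the index points to; the trade-off is that stale records keep taking up space until the log is rewritten.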

1 comment

Do you mind sharing the code? I think it would be interesting to see.