

by traverseda 1229 days ago

I think they might have fixed it. I noted this as a problem with earlier Meilisearch releases as well, but reading through the documentation, it looks like they no longer require the entire index to be in memory and instead let it live in a memory-mapped file.

https://docs.meilisearch.com/learn/advanced/storage.html#lmd...

>For the best performance, it is recommended to provide the same amount of RAM as the size the database takes on disk, so all the data structures can fit in memory.

> [...]

>It is important to note that there is no reliable way to predict the final size of a database. This is true for just about any search engine on the market—we're just the only ones saying it out loud.

Looks like a 10MB document is taking ~200MB, from their docs. I don't think that scales linearly though: since it's an inverted index, the size is going to depend mostly on the number of unique words it finds. You'd expect a pretty big baseline index to cover common English words, and then each document adds a bit on top of that.
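
To make the scaling argument concrete, here's a toy inverted (reverse) index in Python. It's only a sketch of the general idea, not how Meilisearch actually lays out its data: `build_inverted_index` and the sample `docs` are made up for illustration. The point is that the number of unique words grows more slowly than the number of postings as documents repeat common words.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each unique word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample documents, chosen so that later ones reuse earlier words.
docs = [
    "the quick brown fox",
    "the lazy dog",
    "the quick dog jumps over the lazy fox",
]

# Rebuild the index over a growing prefix of the corpus and report its size.
for n in range(1, len(docs) + 1):
    index = build_inverted_index(docs[:n])
    postings = sum(len(ids) for ids in index.values())
    print(f"{n} docs: {len(index)} unique words, {postings} postings")
```

The third document is the longest, but it only adds two new words to the vocabulary; most of what it adds is extra postings on words that are already there.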

Definitely seems like somewhere they could make some improvements though. Some transparent compression could probably help, and with zstd's dictionary feature it could be fine-tuned to the data they're actually seeing.
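
Here's roughly what I mean by dictionary compression. To keep it standard-library only, this sketch swaps in zlib's preset dictionary (`zdict`) for zstd's trained dictionaries; the idea is the same, but it is not zstd and not anything Meilisearch does. The record layout and field names are hypothetical.

```python
import zlib

# Hypothetical "dictionary": bytes that look like the data we expect to store.
# zstd would train this from samples; here it is written by hand.
dictionary = b'{"title": "", "overview": "", "genres": [], "release_date": ""}'

# Hypothetical small record that shares its structure with the dictionary.
record = (
    b'{"title": "Alien", "overview": "A crew wakes up early.", '
    b'"genres": ["horror"], "release_date": "1979"}'
)


def compress(data, zdict=None):
    """Deflate data, optionally priming the compressor with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress(blob, zdict=None):
    """Inflate data; the same dictionary must be supplied if one was used."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


plain = compress(record)
primed = compress(record, dictionary)
print(f"raw={len(record)} plain={len(plain)} with_dict={len(primed)}")
```

Small records are where this helps most, since the compressor otherwise has no history to find repeats in. The catch is that the reader needs the exact same dictionary to decompress.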

Not about to replace Xapian in Kiwix (offline Wikipedia reader) any time soon, I think.