by kristiandupont 1229 days ago
I am using the core (called "Milli") in a local indexer that I run on my repositories and Obsidian files. It works like a charm and I am very happy with it. Obviously that's a use case with very little traffic, but just indexing my repositories folder is quite a bit of work, and it does it surprisingly fast.

The only real thing I am missing is a typeahead feature.


Hello from a Meilisearch team member,

Wow, your project looks very interesting. How do you handle things like the filesystem changing while your indexer is offline? Do you reindex from scratch at startup?

Regarding typeahead, is this what we call "query suggestions"[1]? At the moment, we think that this is something that frontends and SDKs can provide rather than the engine, which means you wouldn't find it at the Milli level. We think you could maybe build an ancillary suggestion index and make two queries instead of one on each keystroke, so as to get both results and suggestions at once.

Here's a chat link[2] to our latest discussions on the topic; feel free to come and weigh in if you're interested!

[1]: https://roadmap.meilisearch.com/c/31-query-suggestions

[2]: https://discord.com/channels/1006923006964154428/10685073658...
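For illustration, the two-query idea could be sketched as follows. This is a minimal in-memory stand-in, not Milli or Meilisearch code: `SuggestionIndex`, `search_documents`, and `type_ahead` are hypothetical names, and the substring match only plays the role of the real engine query.

```python
from bisect import bisect_left


class SuggestionIndex:
    """Hypothetical ancillary index: a sorted list of known phrases."""

    def __init__(self, phrases):
        # Deduplicate and sort so that all phrases sharing a prefix are adjacent.
        self._phrases = sorted({p.lower() for p in phrases})

    def suggest(self, prefix, limit=5):
        """Return up to `limit` phrases that start with `prefix`."""
        prefix = prefix.lower()
        start = bisect_left(self._phrases, prefix)
        matches = []
        for phrase in self._phrases[start:start + limit]:
            if not phrase.startswith(prefix):
                break  # past the block of phrases sharing the prefix
            matches.append(phrase)
        return matches


def search_documents(documents, query):
    """Stand-in for the main engine query: naive substring match."""
    q = query.lower()
    return [doc_id for doc_id, text in documents.items() if q in text.lower()]


def type_ahead(documents, suggestions, query):
    """Run both queries for one keystroke and return them together."""
    return {
        "results": search_documents(documents, query),
        "suggestions": suggestions.suggest(query),
    }
```

In a real setup, the two lookups would be two search requests against two separate indexes, and the suggestion index would have to be populated from something like past queries or document titles, which is an assumption here.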

Thank you! Yes, I reindex. I store the file timestamp along with the contents, so it's not quite as involved as it might seem, but startup does take a bit. Also, I don't have a good way of discovering deleted files at the moment. Not a big deal as it is, but something I will look into.
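A rough sketch of timestamp-based reindexing, which also shows one possible way to spot deleted files: compare the stored timestamps against a fresh scan. The names `scan_tree` and `plan_reindex` are hypothetical, and the `stored` mapping is assumed to be whatever path-to-timestamp data was saved alongside the indexed contents.

```python
import os


def scan_tree(root):
    """Map every file path under `root` to its modification time (ns)."""
    seen = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            seen[path] = os.stat(path).st_mtime_ns
    return seen


def plan_reindex(stored, current):
    """Compare stored timestamps with a fresh scan.

    Returns (to_index, to_delete):
      to_index  - paths that are new or whose timestamp changed
      to_delete - paths that were indexed before but no longer exist
    """
    to_index = sorted(p for p, mtime in current.items() if stored.get(p) != mtime)
    to_delete = sorted(p for p in stored if p not in current)
    return to_index, to_delete
```

At startup this would be called as `plan_reindex(stored, scan_tree(root))`. The scan still walks the whole tree, so startup cost does not disappear, but only the changed files need to be read and indexed again.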

And yes, query suggestions are exactly what I mean. Thank you for informing me, I guess I will have to look into how I can make it myself :-)

You could maybe use something equivalent to the "index hot swap"[1] feature we have at the Meilisearch level, so that at startup you do the reindexing in another index and then atomically swap this fresh index with the old one when it is ready? That way, you get a fast startup at the cost of possibly serving out-of-date information for a while after startup.

(You could even reindex completely from scratch in the background at startup, so there would be no need to discover deleted files at all.)

[1]: https://blog.meilisearch.com/zero-downtime-index-deployment/
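As an illustration only, the swap-at-startup pattern might look like the sketch below. It is an in-memory analogy, not the Meilisearch feature or API: `SwappableIndex`, `rebuild_in_background`, and `load_documents` are hypothetical names, and a plain dict stands in for a real index.

```python
import threading


class SwappableIndex:
    """Serves searches from a live index that can be replaced in one step."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._live = initial  # the old, possibly stale index

    def search(self, term):
        # Grab a reference under the lock, then search without holding it.
        with self._lock:
            live = self._live
        return sorted(doc for doc, text in live.items() if term in text)

    def swap(self, fresh):
        """Replace the live index and return the old one."""
        with self._lock:
            old, self._live = self._live, fresh
        return old


def rebuild_in_background(index, load_documents):
    """Reindex from scratch on a worker thread, then swap the result in."""

    def worker():
        fresh = dict(load_documents())  # full rebuild; deleted files simply vanish
        index.swap(fresh)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Searches keep answering from the old index until `swap` runs, which matches the trade-off described above: fast startup, with possibly stale results until the rebuild finishes.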

That's a great idea, thank you!