by kaathewise 758 days ago
I was searching for an alternative to Meilisearch (which sends out telemetry by default) and found Tantivy. It's more of a search engine builder, but the setup looks pretty simple [0].

[0]: https://github.com/quickwit-oss/tantivy-cli

3 comments

Quickwit also sends telemetry by default: https://quickwit.io/docs/telemetry
Hm, I am interested, but I would love to use it as a rust lib and just have rust types instead of some json config...

The Java SDK of Meilisearch was also nice for the same reason: no need for a CLI or manual configuration. I just pointed it to a DB entity and indexed whole tables...

Would love that for tantivy

> Hm, I am interested, but I would love to use it as a rust lib and just have rust types instead of some json config...

Yes, that's how you normally use tantivy; not sure which JSON config you mean.

tantivy-cli is more like a showcase, https://github.com/quickwit-oss/tantivy is the actual project.

Yes, and there is https://tantivy-search.github.io/examples/basic_search.html

But instead of this, I would prefer some way to just hand it JSON and for it to just index all the fields...

For comparison, this is my Meilisearch SDK code:

    fun createCustomers() {
        val client = Client(Config("http://localhost:7700", "password"))
        val index = client.index("customers")
        // Serialize all customers to one JSON array inside a DB transaction.
        val customersJson = transaction {
            val rows = Customer.all().map { CustomerJson.from(it) }
            Json.encodeToString(ListSerializer(CustomerJson.serializer()), rows)
        }
        // "id" names the primary key field of the documents.
        index.addDocuments(customersJson, "id")
    }
You can just put everything in a JSON field in tantivy and set it to INDEXED and FAST
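For intuition only, here is a rough stdlib-only sketch of why a JSON field needs no per-field schema: each leaf value is turned into a term keyed by its path, so new keys become searchable without being declared first. This is not tantivy's API or its actual implementation; `Json`, `flatten`, and `ToyIndex` are hypothetical names made up for the illustration, and the sketch ignores FAST (columnar) storage entirely.

```rust
use std::collections::{BTreeMap, HashMap};

/// Minimal JSON-like value (hypothetical; a real project would use a JSON crate).
#[derive(Debug, Clone)]
enum Json {
    Str(String),
    Num(i64),
    Obj(BTreeMap<String, Json>),
}

/// Flatten a document into (path, term) pairs, e.g. ("address.city", "berlin").
fn flatten(prefix: &str, value: &Json, out: &mut Vec<(String, String)>) {
    match value {
        Json::Str(s) => {
            // Lowercase and split on whitespace so each word is searchable.
            for word in s.split_whitespace() {
                out.push((prefix.to_string(), word.to_lowercase()));
            }
        }
        Json::Num(n) => out.push((prefix.to_string(), n.to_string())),
        Json::Obj(map) => {
            for (key, child) in map {
                let path = if prefix.is_empty() {
                    key.clone()
                } else {
                    format!("{}.{}", prefix, key)
                };
                flatten(&path, child, out);
            }
        }
    }
}

/// Toy inverted index: (path, term) -> ids of the documents containing it.
#[derive(Default)]
struct ToyIndex {
    postings: HashMap<(String, String), Vec<u32>>,
}

impl ToyIndex {
    /// Index one document; assumes each doc_id is added only once.
    fn add(&mut self, doc_id: u32, doc: &Json) {
        let mut terms = Vec::new();
        flatten("", doc, &mut terms);
        for term in terms {
            let ids = self.postings.entry(term).or_default();
            // Skip the id if this document already produced the same term.
            if ids.last() != Some(&doc_id) {
                ids.push(doc_id);
            }
        }
    }

    /// Exact-term lookup under a given path.
    fn search(&self, path: &str, term: &str) -> Vec<u32> {
        self.postings
            .get(&(path.to_string(), term.to_lowercase()))
            .cloned()
            .unwrap_or_default()
    }
}
```

With a document like `{"name": "Ada Lovelace", "address": {"city": "Berlin"}}` added under some id, `search("address.city", "berlin")` would return that id even though no `address.city` field was ever declared. The trade-off is that every value gets the same generic treatment, with no per-field tokenizer or type choices.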
Hm, I need to read up on the trade offs of going this route.

Thanks!

That's a petty objection to usable interactive search when it's easy to opt out by adding a single command-line argument.
OP is entitled to make political choices when selecting software.

Some of us have specific principles of which things like opt-out telemetry might run afoul.

OP will choose their software, I choose mine and you choose yours; none of us need to call each other petty or otherwise cast such negative judgement; a free market is a free market.

Irrational white-knighting rather than principled discussions doesn't add value here.
Suggesting you should be less judgemental is not white-knighting, nor is it irrational. Sorry bud, but not everyone thinks the way you do; different people have different principles.

Feel free to explain how either of the two comments of yours I've replied to represents principled discussion or adds value, because I'm not seeing it.

It's a minor complaint, but I'm also evaluating it for a minor project. I just don't like the fact that I can forget to add a flag once and, oh, now I'm sending telemetry on my personal medical documents.
Meilisearch only sends anonymized telemetry events. We only send API endpoint usage; nothing like raw documents goes over the wire. You can look at the exhaustive list of all collected data on our website [1].

[1]: https://www.meilisearch.com/docs/learn/what_is_meilisearch/t...

Also, Meilisearch is more like Quickwit, their distributed offering, but Quickwit is AGPL.
They serve quite different use cases.

Quickwit was built to handle extremely large data volumes; you can ingest and search terabytes or petabytes of logs.

Meilisearch's indexing doesn't scale, as it becomes slower the more data you have; e.g., I failed to ingest 7 GB of data.

Hey PSeitz, Meilisearch CEO here. Sorry to hear that you failed to index a low volume of data. When did you last try Meilisearch? We have made significant improvements in the indexing speed. We have a customer with hundreds of gigabytes of raw data on our cloud, and it scales amazingly well. https://x.com/Kerollmops/status/1772575242885484864
Frankly, I'm okay with Meilisearch for instant search because y'all are clear about analytics choices, offer understandable FOSS Rust, and have a non-AGPL license. If/when we make some money, I'm in favor of $upporting and consulting of tools used to keep them alive out of self-interest.