by jabo 2004 days ago
OP - congratulations on shipping! WASM-powered search caught my eye.

Looks like the search index is downloaded and used locally in the browser, so this is as fast as search can get. One trade-off though is that you're limited to relatively small datasets. While this shouldn't be an issue for small to medium-sized static sites, an index that needs to be downloaded over the wire will hurt page performance on larger sites / datasets.

In those cases, you'd want to use a dedicated search server like Algolia or its open source alternative Typesense [1]. Both of them store the index server-side and only return the relevant records that are searched for.

For example, you'd probably not want to download 2M records over the wire to search through them locally [2]. You'd be better off storing the index server-side.

[1] https://github.com/typesense/typesense (shameless plug)

[2] https://recipe-search.typesense.org/


Just a quick non-thought-through idea but would it be possible to build an index in a way that allows clients to download only parts of it based on what they search? I.e. the search client normalizes the query in some way and then requests only the relevant index pages. The index would probably be a lot larger on the server but if disk space is not an issue...?

(Though at some point you have to ask yourself what the benefits of such an approach are compared to a search server.)
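To make the idea concrete, here is a minimal sketch of one way it could work, assuming the index is split by the first two characters of each normalized term. The names `normalize`, `shard_key`, `build_shards`, `search` and `fetch_shard` are all hypothetical and not taken from any project in this thread.

```python
import re
from typing import Callable, Dict, List, Optional, Set

Shard = Dict[str, List[int]]  # term -> ids of the documents containing it


def normalize(term: str) -> str:
    """Lowercase a term and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


def shard_key(term: str, prefix_len: int = 2) -> str:
    """Name the shard for an already-normalized term: its leading characters."""
    return term[:prefix_len]


def build_shards(docs: Dict[int, str], prefix_len: int = 2) -> Dict[str, Shard]:
    """Server side (assumed): split one inverted index into per-prefix shards."""
    shards: Dict[str, Shard] = {}
    for doc_id, text in sorted(docs.items()):
        # A set per document so each term is recorded once per document.
        for term in {normalize(word) for word in text.split()}:
            if term:
                shard = shards.setdefault(shard_key(term, prefix_len), {})
                shard.setdefault(term, []).append(doc_id)
    return shards


def search(
    query: str,
    fetch_shard: Callable[[str], Shard],
    prefix_len: int = 2,
) -> Set[int]:
    """Client side: fetch only the shards the query maps to, then intersect."""
    terms = [t for t in (normalize(word) for word in query.split()) if t]
    result: Optional[Set[int]] = None
    for term in terms:
        shard = fetch_shard(shard_key(term, prefix_len))
        postings = set(shard.get(term, []))
        result = postings if result is None else result & postings
    return result or set()
```

In a browser, `fetch_shard` would presumably be an HTTP request for a static file named after the shard key, so the server stays a plain file host. The sketch only handles exact whole-word matches; typo tolerance or ranking would need more than this.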

Merkle Search Trees: Efficient State-Based CRDTs in Open Networks https://hal.inria.fr/hal-02303490/document

Peer-to-Peer Ordered Search Indexes https://0fps.net/2020/12/19/peer-to-peer-ordered-search-inde... (which adds useful context about the above)

Sorry for just dropping these links, I should already be asleep :)

I keep wanting to try that! Sharding the index would be really cool, and would definitely allow for a seemingly infinite index size. The barrier, as you were getting to, would be making sure index shard downloads could keep up with typing speed.

Definitely something I'll try experimenting with in the future, so... watch this space, perhaps?
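On the typing-speed worry, one possible mitigation (a rough sketch, not something any tool in this thread is known to do) is to cache shards on the client. If shards are keyed by a short prefix, every keystroke after the first couple of characters lands in a shard that has already been downloaded. `CachingShardLoader` and its `fetch` callback are hypothetical names.

```python
from typing import Callable, Dict, List

Shard = Dict[str, List[int]]  # term -> ids of the documents containing it


class CachingShardLoader:
    """Downloads each prefix shard at most once while the user types."""

    def __init__(self, fetch: Callable[[str], Shard], prefix_len: int = 2):
        self._fetch = fetch            # assumed to download one shard by key
        self._prefix_len = prefix_len  # assumed shard granularity
        self._cache: Dict[str, Shard] = {}
        self.downloads = 0             # counts real fetches, for illustration

    def on_keystroke(self, partial: str) -> List[str]:
        """Return the indexed terms that start with the text typed so far."""
        typed = partial.strip().lower()
        if len(typed) < self._prefix_len:
            return []  # too short to identify a shard yet
        key = typed[: self._prefix_len]
        if key not in self._cache:
            self._cache[key] = self._fetch(key)
            self.downloads += 1
        return sorted(t for t in self._cache[key] if t.startswith(typed))
```

With a two-character prefix, typing "t", "to", "tom", "toma" triggers a single download, on the second keystroke. The cost moves to shard size: popular prefixes produce large shards, so the prefix length would need tuning against real data.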

You as well as your business partner are so creative at hijacking threads that it is almost cringeworthy :(

Typesense itself might be a great product, but please stop devaluing it by trying to hijack every search-related post.

As an introvert and an engineer it’s definitely not the easiest for me to talk about something I’m working on openly. So I do cringe every time I mention Typesense in contexts like this thread.

But it seems like every time I mention it, new people discover it and there continues to be interest in the community for an oss search product. I’m sorry if you saw me mention Typesense too often.

IMO your post was fine and informative, I certainly appreciated it. As someone developing something similar, you would have thought about the problem a lot more than the casual reader and be able to provide insight others might not. I can relate to that.

However, it wasn't clear to me that you were involved with Typesense. Re-reading it now, I spot one reference at the very end. If it were instead tied in with your first reference, it would come across as more honest and make your observations more appreciated.