Hacker News
by omnimike 1904 days ago
I’m really glad this exists and is open source. Two years ago I was working on a project that really could have benefitted from a strong ANN search engine but couldn’t find one. In the end I brute forced it (we only had 100k vectors to search, so with a few optimizations it was fine), but I always wished I could have solved it using something like this.
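
For a sense of what brute forcing at this scale can look like, here is a minimal sketch in plain Python. It is not the commenter's actual code: the function names, the cosine metric, and the two optimizations shown (normalizing once up front, keeping only the top k candidates) are assumptions made for illustration.

```python
import heapq
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]


def brute_force_top_k(query, vectors, k=10):
    """Return the k (similarity, index) pairs most similar to `query`.

    `vectors` is assumed to be normalized already, so each comparison
    is a single dot product (hypothetical optimization, not from the thread).
    """
    q = normalize(query)
    scored = ((sum(a * b for a, b in zip(q, v)), i) for i, v in enumerate(vectors))
    # nlargest keeps only k candidates at a time instead of sorting every score.
    return heapq.nlargest(k, scored)


# Tiny stand-in corpus; a real one would hold on the order of 100k vectors.
corpus = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0])]
top = brute_force_top_k([2.0, 0.1], corpus, k=2)
print([i for _, i in top])  # [0, 2]
```

Every query scans the whole corpus, so cost grows linearly with the number of vectors. That is workable at 100k and is the cost an ANN index is meant to avoid at larger scales.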
5 comments

There was a similar situation in a project I worked on. At the time we considered FAISS, Annoy, and SPTAG, but we were really looking for something more managed: think ElasticSearch rather than Lucene. As luck would have it, we stumbled upon https://www.milvus.io/ and it certainly checks a lot of boxes. Still, I'm rather excited to see some competition in this space. It's a broadly applicable trick, so I'll definitely take a look at Vald next time it comes up as well!
We recognize that Milvus is a strong competitor and we think Milvus is a great platform. Like Milvus, Vald will soon be sending pull requests to ann-benchmarks to show that it is one of the fastest ANN engines available over the network.
Annoy and ScaNN are two good options - ScaNN is too new to have solved your previous problem, but Annoy would have been great.

- https://github.com/spotify/annoy

- https://ai.googleblog.com/2020/07/announcing-scann-efficient...

With only 100k vectors, why didn’t you just use Annoy? It’s been around for many years and it is fast, reliable, super easy to use and handles pretty large data quite well too.

https://github.com/spotify/annoy

I looked at Annoy at the time but from memory it required rebuilding the index every time a new vector is inserted, which was an issue for the project I was working on.

I assume that Vald takes care of the reindexing in the background, which would save a lot of work.

Main author and architect of Weaviate (https://github.com/semi-technologies/weaviate) here. This real-time requirement was one of the major design principles from the get-go in Weaviate.

In Weaviate, any imported vector is immediately searchable. You can update and delete your objects, or the vectors attached to them, and all changes are immediately reflected in search results. In addition, every write goes to a write-ahead log, so writes are persisted even if the app crashes.
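
The general write-ahead-log idea can be sketched in a few lines of Python. This is a toy illustration, not Weaviate's implementation: the `TinyVectorStore` class, the JSON-lines log format, and the file name are all hypothetical, and it does not handle a partially written last record.

```python
import json
import os
import tempfile


class TinyVectorStore:
    """Hypothetical in-memory store that logs every write before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.vectors = {}
        self._replay()  # rebuild in-memory state from the log, if one exists
        self._wal = open(wal_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as f:
            for line in f:
                self._apply(json.loads(line))

    def _apply(self, record):
        if record["op"] == "put":
            self.vectors[record["id"]] = record["vector"]
        elif record["op"] == "delete":
            self.vectors.pop(record["id"], None)

    def _log(self, record):
        # Append and fsync first, then update memory, so an acknowledged
        # write can be recovered after a crash.
        self._wal.write(json.dumps(record) + "\n")
        self._wal.flush()
        os.fsync(self._wal.fileno())
        self._apply(record)

    def put(self, id_, vector):
        self._log({"op": "put", "id": id_, "vector": vector})

    def delete(self, id_):
        self._log({"op": "delete", "id": id_})

    def close(self):
        self._wal.close()


path = os.path.join(tempfile.mkdtemp(), "vectors.wal")
store = TinyVectorStore(path)
store.put("a", [0.1, 0.2])
store.put("b", [0.3, 0.4])
store.delete("a")
store.close()  # stand-in for a crash: the in-memory dict is discarded

recovered = TinyVectorStore(path)  # state comes back from the log alone
print(recovered.vectors)  # {'b': [0.3, 0.4]}
recovered.close()
```

Because the in-memory dict is updated in the same call that logs the write, a vector is visible to reads as soon as `put` returns. A real engine also has to keep its ANN index in sync, which is the harder part.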

We wanted to make sure that Weaviate really combines the advantages of AI-based search with the comfort and guarantees you would expect from an "old school" database or search engine.

FAISS?
Also Annoy and more recently ScaNN