|
|
|
|
|
by etiennedi
1896 days ago
|
|
You might be interested in checking out Weaviate (https://github.com/semi-technologies/weaviate) which supports combining vector search with boolean filters. In Weaviate, the vector index is stored alongside an inverted index, so you get the best of both worlds. At the moment on a filtered search, the inverted index is hit first, so you essentially end up with an allow-list of IDs which are then fed to the vector index. If the id isn't on the allow list the next neighbor is selected and so on. There are also plans to further optimize this based on how restrictive the boolean filter is. For example, if the filter still matches a lot of of ids (e.g. 50% of all candidates) the approach outlined above works really well. But if if your filter is extremely restrictive and maybe only matches 1k out of 100M results, then this approach is no longer ideal, as the vector index will discard most of the ids it sees. However, in this case it would be more efficient to actually perform a brute-force search on those 1k vectors. This and other optimizations around combined vector and scalar search are on the roadmap. |
|