Hacker News
by hannasanarion 300 days ago
I don't understand how "using an index" is a solution to this problem. If you're doing search, then you already have an index.

If you use your index to get search results, then you will have a mix of roles that you then have to filter.

If you want to filter first, then you need to make a whole new search index from scratch with the documents that came out of the filter.

You can't use the same indexing information from the full corpus to search a subset: your classical search will have undefined IDF terms and your vector search will find empty clusters.
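To make the IDF half of that concrete, here is a rough sketch. The toy corpus and the "filter result" are made up, and it uses the textbook log(N / df) formula; real engines usually use a smoothed variant, so treat this as an illustration of the mismatch, not of any particular engine's behavior.

```python
import math

def idf(term, docs):
    """Textbook IDF: log(N / df). Undefined (division by zero) when df == 0."""
    df = sum(1 for d in docs if term in d.split())
    return math.log(len(docs) / df)

# Hypothetical four-document corpus.
corpus = [
    "quarterly payroll report",
    "payroll policy for managers",
    "office lunch menu",
    "lunch schedule",
]
# Hypothetical filter output: the only documents this user may see.
subset = corpus[2:]

full_idf = idf("payroll", corpus)      # log(4 / 2)
try:
    subset_idf = idf("payroll", subset)
except ZeroDivisionError:
    subset_idf = None                  # no document in the subset has the term

# The same term can also carry very different weight in the two views.
lunch_full = idf("lunch", corpus)      # log(4 / 2)
lunch_subset = idf("lunch", subset)    # log(2 / 2)
```

Within the filtered subset, "payroll" has no defined weight at all, and "lunch" drops from a moderately rare term to one that appears in every document.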

If you want quality search results and a filter, you have to commit to reindexing your data live at query time after the filter step and before the search step.

I don't think Elastic supports this (the last time I used it, it was being managed in a bizarre way, so I may be wrong). Azure AI Search does this by default. I don't know about others.

1 comment

> I don't understand how "using an index" is a solution to this problem. If you're doing search, then you already have an index

It's a separate index.

You store document access rules in the metadata. These metadata fields can be indexed and then used as a pre-filter before the vector search.
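Roughly, the idea looks like the sketch below. Everything in it is an assumption for illustration: the `roles` field, the `role_index` structure, and the exact brute-force cosine scan standing in for whatever approximate vector index a real system would use.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical documents: access rules live in metadata next to the vector.
docs = [
    {"id": "a", "roles": {"hr"}, "vec": [1.0, 0.0]},
    {"id": "b", "roles": {"eng"}, "vec": [0.9, 0.1]},
    {"id": "c", "roles": {"eng", "hr"}, "vec": [0.0, 1.0]},
]

# The separate metadata index: role -> ids of documents that role may read.
role_index = {}
for d in docs:
    for role in d["roles"]:
        role_index.setdefault(role, set()).add(d["id"])

def search(query_vec, user_roles, k=2):
    """Pre-filter by role, then rank only the permitted documents."""
    allowed = set().union(*(role_index.get(r, set()) for r in user_roles))
    candidates = [d for d in docs if d["id"] in allowed]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]

eng_hits = search([1.0, 0.0], {"eng"})
```

Document "a" is the closest vector to the query, but it never reaches the similarity step for an "eng" user because the metadata lookup drops it first.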

> I don't think Elastic supports this

https://www.elastic.co/docs/solutions/search/vector/knn#knn-...
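For reference, a filtered kNN search body has roughly the shape below. This is a sketch: the field names `embedding`, `allowed_roles`, and `title` are hypothetical, and the helper function is only there to build the dictionary; check the linked docs for the options your version supports.

```python
import json

def filtered_knn_body(query_vector, user_roles, k=10, num_candidates=100):
    """Build a search request body with a kNN clause and a metadata filter.

    Field names are hypothetical; adjust them to your own mapping.
    """
    return {
        "knn": {
            "field": "embedding",               # dense_vector field (assumed name)
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
            # Restrict the kNN search to documents the user's roles may read.
            "filter": {"terms": {"allowed_roles": sorted(user_roles)}},
        },
        "_source": ["title"],
    }

body = filtered_knn_body([0.12, -0.4, 0.9], {"eng"}, k=5)
print(json.dumps(body, indent=2))
```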