by treprinum 874 days ago
Is OpenAI using it in their Assistants API for retrieval? The answer quality of those is really bad, and retrieval is slow compared to Pinecone.

Yes, it's used for the RAG implementation, though I believe we only know that because of information leaked in an error message: https://twitter.com/altryne/status/1721989500291989585
Simon, I am always amazed how you are able to keep up with so much so quickly!!
Hard to believe OpenAI uses Qdrant when they are backed by Microsoft and thus have access to Azure Cognitive Search (now "AI" Search).
Cognitive Search is nowhere near as good as a 'pure' vector DB. Behind the scenes, it's a managed Elasticsearch/OpenSearch with some vector search capabilities. The 'AI' implementations I've done with Cognitive Search always boil down to hybrid (vector + FTS) text search.
In the context of RAG, the goal is not to have a pure vector DB but to gather all the relevant data we can for a user's prompt. This is where Cognitive Search and other existing DBs shine, because they offer a combination of search strategies. Hybrid search on Cognitive Search runs full-text and vector queries in parallel and merges the results, which I find a better approach. Further, MS is rebranding Cognitive Search as Azure AI Search to bring it more in line with the overall Azure AI stack, including Azure OpenAI.
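To make the "merge results" step concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to combine a full-text ranking with a vector ranking. It is an illustration only: the function name, the document IDs, and the constant k=60 are assumptions made for the example, and nothing here is taken from the Cognitive Search implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both retrievers rise to the top.
    k=60 is a commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
full_text_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. BM25 order
vector_hits = ["doc_c", "doc_a", "doc_d"]     # e.g. embedding similarity order

merged = reciprocal_rank_fusion([full_text_hits, vector_hits])
print(merged)
```

Documents found by both retrievers (doc_a, doc_c) end up ahead of documents found by only one, which is the practical benefit of hybrid search over either strategy alone.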
Cognitive Search already includes hybrid search (vector + BM25 + custom ML reranking), and they use chunks of 2048 tokens with a custom tokenizer, so it should now be better than most vector DBs. One could probably build something better by using some version of SPLADE instead of BM25, but their secret sauce is the custom ML reranking model, which gives them the largest boost in search performance.
Do you have any experience with AI Search that would let you compare it to other products?

I’m genuinely curious to know if it’s any good.