| Simple answer: you normally store text in it, but with the state of neural networks these days, most things can be vectorized and searched. Coming from a database background myself but working in search, I'd say the SELECT statement (and joins) probably isn't the best way to wrap your head around things. I would think of the vector as a unique key for a record, and of every query as a LIKE statement, but one that returns a probability of a match instead of an exact match. A great use case is similarity, where we want the things that are closest to what we are looking for, but there isn't an exact match. For example, a user gives me a sentence that says, "How long do I have to be with the company before I get a 401K match?" My vector store has a bunch of vectors, including ones for "A new employee will be eligible for 401K after 6 months." and "The 401K program is run by <MEGACORP X>." I would like to be able to see that the first vector is a closer match to the user's sentence than the second, and by how much. I would also like to do this without having to change my code much based on the structure of the text. Luckily, there is a very simple algorithm for doing this (cosine similarity) that doesn't change regardless of the sentence structure or the question being asked. Also, it doesn't matter what kind of question/answer you use as long as it can be vectorized, so you could even give me a vector representing an image and I can give you the image that is most similar. Here is the most interesting thing about vectors: with very little effort they turn the English language into a programming language. Instead of typing "SELECT document_id, document_name, document_body FROM documents WHERE (document_body LIKE '%401K%' AND document_body LIKE '%match%' AND document_body LIKE '%existing employee%')" I can just ask, "How long do I have to be with the company before I get a 401K match?"
and I will get back a result and a match probability. Changing my text changes the matches, sometimes in ways that are profound and unexpected. Note that the SQL query I gave would not return any rows, because none of my documents contain the term "existing". Building the correct SQL query could be quite complex in comparison to just using the text. This is pretty great for long-tail search, Q&A, image search, recommendations, classification, etc. BTW, I am biased: I work for Elastic (makers of Elasticsearch), and we have been doing traditional search forever and vector/hybrid search for the last few years. |
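
To make the cosine similarity idea above concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made-up toy numbers, not real embeddings (a real model would produce hundreds of dimensions), and the names `cosine_similarity`, `rank_by_similarity`, `QUERY_VEC` and `TOY_STORE` are hypothetical ones chosen for this illustration. Strictly speaking, the score is a similarity between -1 and 1 rather than a true probability, but it can be used the same way to rank matches.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, store):
    """Score every stored vector against the query, best match first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store.items()]
    return sorted(scored, reverse=True)


# Toy vectors invented for illustration; a real system would get these
# from an embedding model.
QUERY_VEC = [0.9, 0.8, 0.1]  # stands in for the user's 401K question
TOY_STORE = {
    "A new employee will be eligible for 401K after 6 months.": [0.8, 0.9, 0.2],
    "The 401K program is run by <MEGACORP X>.": [0.9, 0.1, 0.8],
}

ranked = rank_by_similarity(QUERY_VEC, TOY_STORE)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

The ranking code never looks at the text itself, only at the vectors, which is why the same function works whether the vectors came from sentences, images, or anything else that can be embedded.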