by pradn
751 days ago
I'm with you on this. The vector embedding craze seems to be confusing mechanism and problem. The problem is semantic similarity search. One mechanism is vector embedding. I think all this comes from taking LLMs as a given, seeing that they work reasonably well with phrase-input semantic retrieval, and then hyper-optimizing vector embedding / search to achieve it. Are there other semantic search systems? What happened to the entire field of Information Retrieval - is vector search the only method? Is all the stemming, linguistic analysis, and the rest of it obsoleted by vectors? Or is it purely because vector search is quick? That's just an engineering problem. I'm not convinced it's the only method here. Happy to be corrected!
My sense is that you can currently break the whole thing down into two groups. The proverbial grownups in the room are typically building pipelines that still do it basically how the top-performing systems did in the '90s, with a souped-up keyword and metadata search engine for the initial pass, and an embedding model for catching some of the stuff it misses and/or for result ranking. Not every search engine works this way, but the ones you don't particularly mind using likely do. Web search, for example.
And then there's the proverbial internet comments section, which wants to skip past all the boring, labor-intensive old-school stuff, and instead just begin and end with approximate nearest-neighbor search using an off-the-shelf embedding model. The primary advantage to this approach - and I should admit here that I've tried it myself - is that you can bodge it together over a weekend and have the blog post up by Monday.
I guess what I'm getting at is, the people producing content on the Internet and the people producing effective software aren't necessarily the same people. I mean, heck, look at me, I'm only here to type this comment because I'm slacking off at work today.
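To make the first group's approach a bit more concrete, here's a rough sketch of the two-stage shape: a keyword first pass that picks candidates, then a vector similarity step that reorders them. Everything in it is an assumption made up for illustration. The function names are hypothetical, the first pass is plain term overlap rather than a real keyword and metadata engine, and `toy_embed` is just character trigram counts standing in for an actual embedding model. It only shows the rerank variant, not the "catch what the keyword pass missed" variant.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_candidates(query, docs, limit=10):
    """First pass: keep docs sharing at least one query term, most overlap first."""
    q_terms = set(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        overlap = len(q_terms & set(tokenize(text)))
        if overlap:
            scored.append((overlap, doc_id))
    # Sort by descending overlap, breaking ties by doc id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:limit]]


def toy_embed(text):
    """Stand-in for a real embedding model (assumption): character trigram counts."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs, limit=10):
    """Second pass: rerank the keyword candidates by vector similarity to the query."""
    candidates = keyword_candidates(query, docs, limit)
    q_vec = toy_embed(query)
    return sorted(
        candidates,
        key=lambda doc_id: cosine(q_vec, toy_embed(docs[doc_id])),
        reverse=True,
    )
```

Calling `search("keyword search", docs)` on a small dict of `{doc_id: text}` returns only the ids that survived the keyword pass, reordered by the similarity step. The weekend-project version would drop `keyword_candidates` entirely and run the similarity step over every document.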
1: https://www.oreilly.com/radar/what-we-learned-from-a-year-of...