by jhj
879 days ago
Speaking as an author of one of the primary libraries for doing this stuff (faiss): it is not, because how approximate high-dimensional dense or sparse nearest-neighbor search should work is still an open-ended research problem, let alone maximum inner product search, where the research story is even worse, or other non-metric-space similarity measures. All of the current techniques still involve quite unacceptable tradeoffs. While traditional database indexing is also still an open-ended research problem (e.g., read amplification/write amplification tradeoffs and the like), it produces exact solutions. That isn't the case at all for vector indexing, apart from brute-force search or exact indexes like k-D/BSP trees, which don't work well in high dimensions due to the curse of dimensionality.
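
To make the exact-versus-approximate gap concrete, here is a minimal sketch, assuming NumPy and a toy inverted-file scheme whose centroids are just randomly chosen data points. It is not how faiss implements anything, and every function name below is hypothetical. Brute-force search always returns the true nearest neighbor; the toy index scans only the `nprobe` closest clusters, so it can miss the true neighbor unless it probes every cluster.

```python
import numpy as np


def exact_nn(data, query):
    """Brute-force search: scan every vector and return the true nearest index."""
    dists = ((data - query) ** 2).sum(axis=1)
    return int(np.argmin(dists))


def build_toy_ivf(data, nlist, rng):
    """Toy inverted-file index (illustrative only, no k-means training)."""
    # Assumption: random data points stand in for trained centroids.
    centroids = data[rng.choice(len(data), size=nlist, replace=False)]
    # Assign each vector to its closest centroid.
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    lists = [np.flatnonzero(assign == c) for c in range(nlist)]
    return centroids, lists


def ivf_nn(data, centroids, lists, query, nprobe):
    """Approximate search: scan only the nprobe clusters closest to the query."""
    centroid_dists = ((centroids - query) ** 2).sum(axis=1)
    probe = np.argsort(centroid_dists)[:nprobe]
    candidates = np.concatenate([lists[c] for c in probe])
    dists = ((data[candidates] - query) ** 2).sum(axis=1)
    return int(candidates[np.argmin(dists)])


def recall_at_1(data, queries, centroids, lists, nprobe):
    """Fraction of queries where the approximate answer matches brute force."""
    hits = sum(
        ivf_nn(data, centroids, lists, q, nprobe) == exact_nn(data, q)
        for q in queries
    )
    return hits / len(queries)


rng = np.random.default_rng(0)
data = rng.standard_normal((2000, 32))    # synthetic vectors, 32 dimensions
queries = rng.standard_normal((100, 32))
centroids, lists = build_toy_ivf(data, nlist=16, rng=rng)

for nprobe in (1, 4, 16):
    r = recall_at_1(data, queries, centroids, lists, nprobe)
    print(f"nprobe={nprobe:2d}  recall@1={r:.2f}")
```

With `nprobe` equal to the number of clusters, the toy index scans everything and degenerates into brute force. Anything less trades recall for speed, which is the kind of tradeoff a B-tree lookup never asks you to make.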