by cevian
592 days ago
(post co-author here) The DB is the right layer from an interface point of view, because that's where the data properties should be defined. We also use the DB for bookkeeping what needs to be done, because we can leverage transactions and triggers to make sure we never miss any data. From an implementation point of view, the actual embedding happens outside the database, in a Python worker or cloud functions. Merging the embeddings and the original data into a single view allows the full feature set of SQL rather than being constrained by a REST API.
It is certainly convenient for the end user, but it hides things. What if the API calls to OpenAI fail or get rate limited? How is that surfaced? Will I see it in my observability? Will queries just silently miss results?
If the DB did the embedding itself, synchronously within the write, it would make sense. That would be more like Elasticsearch or a typical full-text index.