Hacker News new | ask | show | jobs
by falling_myshkin 831 days ago
been a lot of these RAG abstractions posted recently. As someone working on this problem, it's unclear to me whether the calculation and ingestion of embeddings from source data should be abstracted into the same software package as their search and retrieval. I guess it probably depends on the complexity of the problem. This does seem interesting in that it does make intuitive sense to have a built-in db extension if the source data itself is coming from the same place as the embeddings are going. But so far I have preferred a separation of concerns in this respect, as it seems that in some cases the models will be used to compute embeddings outside the db context (for example, the user search query needs to get vectorized. why not have the frontend and the backend query the same embedding service?) Anyone else have thoughts on this?
1 comments

Its certainly up for debate and there is a lot of nuance. I think it can simplify the system's architecture quite a bit of all the consumers of data do not need to keep track of which transformer model to use. After all, once the embeddings are first derived from the source data, any subsequent search query will need to use the same transformer model that created the embeddings in the first place.

I think the same problem exists with classical/supervised machine learning. Most model's features went through some sort of transformation, and when its time to call the model for inference those same transformations will need to happen again.