by aashu_dwivedi
951 days ago
Do you mean it's faster when the embeddings are pre-computed, or is it also faster when the embeddings are computed on the fly?
Also, what's the recommended way to store the ColBERT embeddings? Because of their 2D nature (one vector per token rather than one per document), it's not practical to store them in a vector database.
|
Document and query embeddings can be obtained using the .encode_documents and .encode_queries methods.
I save most of my embeddings (a Python dictionary with document IDs as keys and embeddings as values) using joblib in a bucket in the cloud. I don't really know if it's good practice, but it scales fine to a few million documents for offline (non-real-time) applications.
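
For what it's worth, here is a minimal sketch of that storage pattern. It assumes each document's embedding is a NumPy array of shape (num_tokens, dim); the random arrays, document IDs, file name, and local temp directory are placeholders standing in for real .encode_documents output and a real bucket upload, so treat it as an illustration rather than the exact setup described above.

```python
import os
import tempfile

import joblib
import numpy as np

# Placeholder for real ColBERT output: one (num_tokens, dim) matrix per
# document. IDs, shapes, and values are made up for illustration.
rng = np.random.default_rng(0)
embeddings = {
    "doc-1": rng.standard_normal((12, 128)).astype(np.float32),
    "doc-2": rng.standard_normal((7, 128)).astype(np.float32),
}

# Dump the whole dictionary to a single file. In practice this file is
# what would be uploaded to the bucket; here it only goes to a temp dir.
path = os.path.join(tempfile.mkdtemp(), "embeddings.joblib")
joblib.dump(embeddings, path, compress=3)

# Later, in the offline job: load everything back into memory.
restored = joblib.load(path)
```

Keeping one dictionary per file means each load pulls every matrix in that file into memory, which suits batch jobs. For larger collections, splitting the dictionary into several shard files would probably be the next step.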