How would you host a sentence-transformers model for free? You need it to vectorize each query, so it has to be hosted somewhere. Is there any way to do that for free?
Just run it on CPU, on your own machine. That's the cheapest way. You could also rent a free/cheap VPS, and even parallelize across multiple machines/cores if you need to.
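A minimal sketch of the run-it-on-CPU approach, assuming the sentence-transformers package is installed. The model name all-MiniLM-L6-v2 is only a placeholder, and the function names, port, and JSON shape are made up for illustration:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: any small sentence-transformers model will do; swap in your own.
MODEL_NAME = "all-MiniLM-L6-v2"
_model = None  # loaded on first use, then reused for every request


def embed(texts, encoder=None):
    """Return one embedding (a list of floats) per input text.

    `encoder` is an optional callable taking a list of strings; it exists
    so the function can be exercised without downloading a model.
    """
    global _model
    if encoder is None:
        if _model is None:
            # Imported lazily so this module loads without the library present.
            from sentence_transformers import SentenceTransformer
            _model = SentenceTransformer(MODEL_NAME, device="cpu")
        encoder = _model.encode
    # Convert numpy rows (or any iterable of numbers) to plain floats for JSON.
    return [[float(x) for x in vec] for vec in encoder(texts)]


class EmbedHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: POST {"texts": [...]} -> {"embeddings": [[...], ...]}."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"embeddings": embed(payload["texts"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Blocking call: serve embeddings from this machine until interrupted."""
    HTTPServer((host, port), EmbedHandler).serve_forever()
```

Calling serve() starts the endpoint on localhost; POST a JSON body like {"texts": ["some query"]} and you get back one vector per text. If one process isn't enough, running one copy per core (or per machine) is the simplest way to parallelize.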
Maybe I'm grumpy today, but I am shocked at how many responses you are getting where people think this is a novel idea. Has the engineering mindset really shifted into a default of "buy" even when "build" could take less than a week?
I was surprised, too, but then I realized they all work at Qdrant.
But the general dialogue around AI-related tools is surprising to me. The production parts of LangChain, embeddings, and similar tools can usually be built in a few hours, with better observability, performance, and maintainability.