by yobbo 1751 days ago
What in your system is doing the text-to-vector encoding, and how did you train it?

We're using Transformers with the `sentence-transformers/paraphrase-distilroberta-base-v1` model.
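For anyone who wants to try that model outside Jina, here is a minimal sketch of text-to-vector encoding plus cosine-similarity ranking. It is not the code from the meme search repo. It assumes the separate `sentence-transformers` package is installed, and the helper names (`load_encoder`, `cosine`, `rank`) are made up for illustration. The encoder is passed in as a plain function, so the ranking logic can be tried with any embedding source.

```python
import math
from typing import Callable, List, Sequence, Tuple

# An encoder maps a list of strings to one vector per string.
Encoder = Callable[[List[str]], List[List[float]]]


def load_encoder(
    model_name: str = "sentence-transformers/paraphrase-distilroberta-base-v1",
) -> Encoder:
    """Build an encoder from a pre-trained model (hypothetical helper).

    Assumes `pip install sentence-transformers`; the weights are
    downloaded the first time the model is loaded.
    """
    # Imported lazily so the rest of the sketch works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # encode() returns a NumPy array by default; convert to plain lists.
    return lambda texts: model.encode(list(texts)).tolist()


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str, captions: List[str], encode: Encoder) -> List[Tuple[str, float]]:
    """Return (caption, score) pairs, most similar to the query first."""
    vectors = encode([query] + list(captions))
    query_vec, caption_vecs = vectors[0], vectors[1:]
    scored = [(c, cosine(query_vec, v)) for c, v in zip(captions, caption_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Usage would look something like `rank("school", ["my first day of school", "cat ignores owner"], load_encoder())`. In the actual project the framework handles indexing and search, so this only illustrates what the encoder step produces and how the vectors get compared.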

The framework is Jina (https://github.com/jina-ai/jina/) so it's pretty high-level. You can see the indexing/search Flow on lines 37-52 of https://github.com/alexcg1/jina-meme-search-example/blob/mai...

We rely on pre-trained models at the moment, since Jina supports loads of them out of the box.

For image search we use the Big Transfer Encoder (https://github.com/jina-ai/executors/tree/main/jinahub/encod...), but we may switch to the CLIPImage encoder at some point.