by navar
159 days ago
For the retrieval stage, we have developed a highly efficient, CPU-only-friendly text embedding model: https://huggingface.co/MongoDB/mdbr-leaf-ir It ranks #1 on a bunch of leaderboards for models of its size. It can be used interchangeably with the model it was distilled from (https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1...). You can see an example comparing semantic (i.e., embedding-based) search vs BM25 vs hybrid here:
http://search-sensei.s3-website-us-east-1.amazonaws.com
(Warning: it downloads ~50MB of data for the model weights and ONNX Runtime on first load, but should otherwise run smoothly even on a phone.) This mini app illustrates the advantage of semantic search over BM25. For instance, embedding models "know" that "j lo" refers to Jennifer Lopez. We have also published the recipe for training this type of model, in case you are interested; we show that it can be done on relatively modest hardware and that training data is very easy to obtain: https://arxiv.org/abs/2509.12539
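As a rough illustration of the three modes compared above, here is a minimal sketch in plain Python. It is not the demo's implementation: the BM25 scorer is a simplified whitespace-token version, the semantic scores are hard-coded placeholders standing in for embedding similarities, and reciprocal rank fusion (RRF) is just one common way to build a hybrid ranking, assumed here for illustration. All function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a simplified Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ranking(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several rankings; each document gets sum of 1 / (k + rank)."""
    fused = Counter()
    for r in rankings:
        for rank, doc_id in enumerate(r, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


docs = [
    "jennifer lopez announces a new album",
    "j lo tour dates and tickets",
    "intro to python for beginners",
]
query = "j lo"

bm25 = bm25_scores(query, docs)
# Placeholder similarities (made up); a real system would compare
# query and document embeddings, e.g. with cosine similarity.
semantic = [0.83, 0.79, 0.05]

hybrid = reciprocal_rank_fusion([ranking(bm25), ranking(semantic)])
print("bm25:", ranking(bm25))
print("semantic:", ranking(semantic))
print("hybrid:", hybrid)
```

BM25 gives the first document a score of zero because it shares no tokens with the query, which is the gap the semantic ranking is meant to cover; the fused ranking lets both signals contribute.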
|
I don't know if this is too much to ask, but something that would really help me adopt your model is a fine-tuning setup. The BGE series of embedding models has been my go-to for a couple of years now, not because it's the best-performing on the leaderboards, but because its authors make it so incredibly easy to fine-tune the models [0]. Give it a JSONL file of training triplets, and you can fine-tune the base models on your own dataset. I appreciate you linking to the paper with the recipe for training this type of model. How close to turnkey is your model for transfer learning with my own dataset? I looked around for a fine-tuning example for this model and didn't see anything, but I would be very interested in trying it out.
Does support for fine-tuning already exist? If so, I could switch from BGE to this model immediately.
* [0] - https://github.com/FlagOpen/FlagEmbedding/tree/master/exampl...
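For concreteness, a minimal sketch of the kind of JSONL triplet file described above. The `query` / `pos` / `neg` keys are an assumption based on the layout the FlagEmbedding fine-tuning examples use, so check [0] for the exact schema; the helper names and the sample triplets are hypothetical.

```python
import json


def write_triplets(path, triplets):
    """Write (query, positives, negatives) triplets as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for query, positives, negatives in triplets:
            # Assumed schema: one query, a list of relevant passages,
            # and a list of irrelevant (hard negative) passages.
            record = {"query": query, "pos": list(positives), "neg": list(negatives)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_triplets(path):
    """Read the JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up sample data, only to show the shape of a record.
sample_triplets = [
    (
        "j lo new album",
        ["Jennifer Lopez announces a new album"],
        ["Intro to Python for beginners"],
    ),
]
```

Calling `write_triplets("train.jsonl", sample_triplets)` would produce a file with one line per training example, which is the shape a triplet-based fine-tuning script typically consumes.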