Assuming "usual embedding" means the default model, which is generally "all-MiniLM-L6-v2": I used MixedBread's embedding model instead, for the reasons given in [1].

You can judge how well a model is doing by reading through the search results for papers you know well. Another check I use is to look at 2D "maps" of the embeddings and see how well they are segregated; see [2].
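If you want a rough number to go with the visual check, one option is to measure how often a point's nearest neighbour on the 2D map has the same category. This is only a sketch of that idea, not how the maps in [2] were produced: the function name `neighbor_label_agreement`, the toy coordinates, and the category labels are all made up for illustration, and it assumes you already have 2D coordinates plus one label per paper.

```python
import math


def neighbor_label_agreement(points, labels):
    """Fraction of points whose nearest neighbour (Euclidean) shares their label.

    points: list of (x, y) tuples, e.g. a 2D projection of embeddings.
    labels: list of category labels, one per point (hypothetical categories).
    A value near 1.0 suggests well-segregated groups; near 0.0 suggests mixing.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    if len(points) < 2:
        raise ValueError("need at least two points")

    agree = 0
    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue  # a point is not its own neighbour
            d = math.dist(p, q)
            if d < best_d:
                best_j, best_d = j, d
        if labels[best_j] == labels[i]:
            agree += 1
    return agree / len(points)


# Toy data (invented): two tight, far-apart clusters.
separated = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
separated_labels = ["physics"] * 3 + ["biology"] * 3

# Toy data (invented): categories alternate along a line, so they are fully mixed.
mixed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
mixed_labels = ["physics", "biology", "physics", "biology"]

print(neighbor_label_agreement(separated, separated_labels))
print(neighbor_label_agreement(mixed, mixed_labels))
```

The loop is quadratic in the number of points, so it is fine for a sample of a few thousand papers but not for a full corpus. A score like this also inherits any distortion introduced by the 2D projection, so it complements the manual look at search results rather than replacing it.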

[1]: https://www.mixedbread.ai/blog/binary-mrl

[2]: https://raw.githubusercontent.com/mitanshu7/dumpyard/refs/he...