Hacker News
by xfalcox 202 days ago
I am partial to https://huggingface.co/Qwen/Qwen3-Embedding-0.6B nowadays.

Open weights, multilingual, 32k context.

2 comments

Also matryoshka and the ability to guide matches by using prefix instructions on the query.

I have ~50 million sentences from English Project Gutenberg novels embedded with this.
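
For anyone unfamiliar with the two features mentioned above, here is a rough sketch of what they amount to in practice. It does not load the model. The `Instruct: ...\nQuery:...` layout is the query format as I recall it from the Qwen3-Embedding model card, so treat it as an assumption and check the card before relying on it. The helper names (`build_query`, `truncate_embedding`) and the toy vector are made up for illustration.

```python
import math


def build_query(task: str, query: str) -> str:
    """Prefix a query with a task instruction.

    Assumed format (from memory of the Qwen3-Embedding model card):
    only queries get the instruction; documents are embedded as-is.
    """
    return f"Instruct: {task}\nQuery:{query}"


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Matryoshka-style truncation: keep the first `dim` components,
    then re-normalize so cosine / dot-product search still behaves."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # avoid dividing by zero on an all-zero prefix
    return [x / norm for x in head]


if __name__ == "__main__":
    # Hypothetical task description and query, for illustration only.
    print(build_query(
        "Given a sentence, retrieve sentences with a similar narrative function",
        "She closed the door and did not look back.",
    ))

    # Toy 3-dimensional "embedding" cut down to 2 dimensions.
    print(truncate_embedding([3.0, 4.0, 12.0], 2))
```

The truncation step is the part that matters for storage: at tens of millions of vectors, keeping a shorter prefix of each embedding cuts the index size proportionally, at some cost in retrieval quality that you would have to measure on your own data.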

Why would you do that? I'd love to know more.
The larger project is to allow analyzing stories for developmental editing.

Back in June and August I wrote some LLM-assisted blog posts about a few of the experiments.

They are here: sjsteiner.substack.com

What are you using those embeddings for, if you don't mind me asking? I'd love to know more about the workflow and what the prefix instructions are like.
It's junk compared to BGE M3 on my retrieval tasks.