by neodypsis 540 days ago
How does it compare to Jina V3 [0], which also has 8192 context length?

0. https://arxiv.org/abs/2409.10173

1 comment

They perform different roles, so they're not directly comparable.

Jina V3 is an embedding model: a base model that has been further fine-tuned specifically for embedding-style tasks (retrieval, similarity, ...). These are what we call "downstream" models/applications.

ModernBERT is a base model and architecture. It's not meant to be used out of the box, but to be fine-tuned for other use cases, serving as their backbone. In theory (and, given early signals, most likely in practice too), it'll make for really good downstream embedding models once people build on top of it!