Hacker News
by shahjaidev 542 days ago
The community would benefit a lot from a multilingual ModernBERT. Pretraining on a multilingual corpus is crucial if a ranking/retrieval model is to be deployed in many industry settings. Simply extending the vocabulary and fine-tuning the English checkpoint won’t quite work. Are there any plans to release a multilingual checkpoint?