Hacker News
by cointegrated 90 days ago
This project is absolutely NLLB 2.0 in spirit. However, we decided to reserve the name “OMT-NLLB” for the subset of the new models that have an encoder-decoder architecture similar to the original NLLB-200. The other models are called “OMT-LLaMA” and have a classical LLM architecture. The idea here (and we had to emphasize it to justify the project internally) is that we are developing not just new models but a recipe for massive multilinguality that can be integrated into general-purpose LLMs.