Jais – the world’s most advanced Arabic large language model
4 points by cs-fan-101 1026 days ago
Cerebras, G42's Inception, and MBZUAI are pleased to announce Jais, the world’s best-performing Arabic LLM. Jais is a 13B parameter model that was trained on a new 395 billion token Arabic-English-Code dataset. Jais brings the power of Generative AI to 400m Arabic speakers across 25 nations.

Jais highlights:

- State-of-the-art 13-billion-parameter bilingual Arabic-English model

- Trained on a new dataset of 116 billion Arabic tokens (drawn from books, Wikipedia, and machine translation from English) plus 279 billion English/code tokens

- Bidirectional transfer learning: Arabic improved because of the English tokens, and English improved because of the Arabic tokens

- Open source and available for download on Hugging Face
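
As a quick sanity check on the figures above, the Arabic and English/code token counts can be added up and compared against the 395 billion total quoted in the announcement. This is only an arithmetic sketch: the constant and function names are made up for illustration, and the percentage split is derived from the quoted counts, not a figure stated by the authors.

```python
# Token counts in billions, copied from the announcement text.
ARABIC_TOKENS_B = 116
ENGLISH_CODE_TOKENS_B = 279
REPORTED_TOTAL_B = 395


def dataset_mix(arabic_b: int, english_code_b: int) -> dict:
    """Return the combined token count and each part's share of it.

    Hypothetical helper for illustration; not part of any Jais tooling.
    """
    total = arabic_b + english_code_b
    return {
        "total_b": total,
        "arabic_share": arabic_b / total,
        "english_code_share": english_code_b / total,
    }


mix = dataset_mix(ARABIC_TOKENS_B, ENGLISH_CODE_TOKENS_B)
print(f"Total: {mix['total_b']}B tokens")
print(f"Arabic share: {mix['arabic_share']:.1%}")
print(f"English/code share: {mix['english_code_share']:.1%}")
```

The two parts sum to the reported 395 billion, with Arabic making up a bit under a third of the training mix.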

To learn more, check out the following:

- Press Release: https://www.cerebras.net/press-release/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception

- Technical Paper: https://www.inceptioniai.org/jais/docs/Technicalpaper.pdf

- Models on Hugging Face: https://huggingface.co/inception-mbzuai