by sanxiyn 616 days ago
Jamba 1.5 Large is 398B params (94B active) and weights are available.

https://arxiv.org/abs/2408.12570
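
For a rough sense of scale, here is a quick back-of-the-envelope sketch using only the two figures quoted above. The 2 bytes per parameter (e.g. bf16 weights) is my own assumption, not something stated in the post, and the variable names are just illustrative.

```python
# Figures quoted in the post above.
TOTAL_PARAMS = 398e9   # total parameters
ACTIVE_PARAMS = 94e9   # parameters active per token

# Assumption (not from the post): weights stored at 2 bytes per parameter, e.g. bf16.
BYTES_PER_PARAM = 2

# Share of the model that is actually used for each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Raw weight storage in GB (1 GB = 1e9 bytes), ignoring activations and KV cache.
weight_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

print(f"active fraction: {active_fraction:.1%}")
print(f"weights at {BYTES_PER_PARAM} bytes/param: {weight_gb:.0f} GB")
```

So a bit under a quarter of the parameters are active per token, while the full set of weights still has to be held somewhere, which under the 2-byte assumption is on the order of 800 GB.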

2 comments

Thanks for the link. The benchmark results aren't too impressive for its size, but it likely hasn't been trained as thoroughly as Llama. (I couldn't find the training size in the paper, but I doubt they have access to as much compute as Meta.) So it still feels encouraging that it doesn't look ridiculous either.
Not as much as Meta, no. But AI21 Labs is partnered with Amazon and did a ~$200M funding round last year IIRC, so there are still plenty of funds for training big models.
Thanks, missed that one.

For context, GPT-4 is supposedly at 1.8T params.