by practice9 1139 days ago
Models replicating LLaMA are cool, but they are all missing proper multilingual support, which GPT-3.5 is quite good at.
2 comments

IMHO multilingual support would just use up precious capacity in those models. Why not use the model in English and a separate one for translation?
That would work if all information is available in English as the primary language. That's not the case though. You may be missing out on interesting information if you're skipping other languages.
It depends on your use.

LLaMA’s main issue is that its license prevents commercial use.

If you want to use an LLM inside a product, you may need to internationalize it at some point, so multilingual support matters.

LLaMA 65B is actually quite decent in other languages. I can only just fit it in memory, though, with my 128 GB of RAM. Usually I run the 8-bit quantized version, which uses 80 GB, but even the 4-bit and 3-bit versions are OK compared to the fp16 30B version.
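
For a rough sense of where those numbers come from, here is a back-of-the-envelope sketch of weight memory at different bit widths. It assumes the nominal parameter counts (65e9 and 30e9) and counts weights only; the function name `weight_memory_gb` is made up for illustration. Real quantized formats also store scales and other metadata, and inference needs extra room for activations and context, so actual usage runs higher than these figures (the 80 GB reported above for 8-bit, for example).

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no per-block scales, activations, or context cache included.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Nominal sizes from the thread; the true parameter counts differ slightly.
configs = [
    ("65B fp16", 65e9, 16),
    ("65B 8-bit", 65e9, 8),
    ("65B 4-bit", 65e9, 4),
    ("65B 3-bit", 65e9, 3),
    ("30B fp16", 30e9, 16),
]

for name, n_params, bits in configs:
    print(f"{name}: ~{weight_memory_gb(n_params, bits):.1f} GB of weights")
```

By this estimate the 4-bit and 3-bit 65B weights take less memory than the fp16 30B weights, which is why the comparison between them is interesting.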