by lerela
1193 days ago
The FP16 7B version runs on my Ubuntu XPS with 32GB of memory at ~300ms/token. 13B also works, but the results aren't really good (the model starts looping after a few sentences), so the parameters probably need tuning. So far I've been unable to reliably generate output in a language other than English: the model very quickly starts translating (even when not asked to) or just switches to English.
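For anyone wondering what "parameters" might mean for the looping problem: I haven't pinned down which ones matter, but temperature and a repetition penalty are the usual suspects. Below is a rough, library-free sketch of how those two knobs could reshape the next-token distribution. The function names, the default values (0.8 and 1.3), and the divide-or-multiply penalty rule are assumptions for illustration, not what any particular LLaMA runner actually does.

```python
import math
import random


def next_token_probs(logits, history, temperature=0.8, repetition_penalty=1.3):
    """Turn raw logits into sampling probabilities.

    logits: list of floats, one per vocabulary entry (hypothetical input).
    history: token ids generated so far.
    Defaults are illustrative guesses, not tuned values.
    """
    adjusted = list(logits)
    # Make already-emitted tokens less likely: shrink positive logits,
    # push negative ones further down.
    for tok in set(history):
        if adjusted[tok] > 0:
            adjusted[tok] /= repetition_penalty
        else:
            adjusted[tok] *= repetition_penalty
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in adjusted]
    # Numerically stable softmax.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_next_token(logits, history, rng=random, **params):
    """Draw one token id from the adjusted distribution."""
    probs = next_token_probs(logits, history, **params)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Raising `repetition_penalty` above 1.0 lowers the probability of tokens already in `history`, which is one plausible way to break a loop; whether it actually fixes the 13B output is something I'd have to test.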