|
|
|
|
|
by eyegor
929 days ago
|
|
Any 7b can run well (~50 tok/s) on an 8gb gpu if you tune the context size. 13b can sometimes run well but typically you'll end up with a tiny context window or slow inference. For cpu, I wouldn't recommend going above 1.3b unless you don't mind waiting around. |
|