by senko
426 days ago
I'm doing that with a 12GB card; ollama supports it out of the box. For some reason it only uses around 7GB of VRAM, probably because of how the layers are scheduled. I could maybe tweak something there, but didn't bother just for testing. Obviously perf depends on CPU, GPU and RAM, but on my machine (3060 + i5-13500) it's around 2 t/s.
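
For anyone who does want to try the tweak, here is a rough sketch, not something I've tuned. It assumes a local Ollama server on the default port and uses the `num_gpu` option, which as far as I know sets how many layers get offloaded to the GPU. The model name `some-model` and the layer counts are placeholders, not values from my setup. Throughput is computed from the `eval_count` and `eval_duration` (nanoseconds) fields of the response.

```python
import json
import urllib.request

# Default local Ollama endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, gpu_layers):
    """Build a non-streaming generate request that asks for gpu_layers layers on the GPU."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": gpu_layers},
    }


def tokens_per_second(response):
    """Generation speed from the response stats (eval_duration is in nanoseconds)."""
    seconds = response["eval_duration"] / 1e9
    return response["eval_count"] / seconds


def run(model, prompt, gpu_layers):
    """Send one request to the local server and return the parsed JSON response."""
    body = json.dumps(build_request(model, prompt, gpu_layers)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder model name and layer counts; pick ones that fit your card.
    for layers in (20, 30, 40):
        result = run("some-model", "Write one sentence about GPUs.", layers)
        print(layers, "layers:", round(tokens_per_second(result), 2), "t/s")
```

Watching VRAM usage while stepping the layer count up should show whether the 7GB ceiling is a scheduling choice or a hard limit for that model.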