by qudat
1041 days ago
I just wanted to call out that some of these quick-to-start tools are CPU-only (e.g. ollama), which is great to play with, but if you want your GPU you've gotta go to llama.cpp. Further, 70B support in llama.cpp is still under development as far as I know.

Ollama on macOS will use both the GPU and the Accelerate framework. It's built with the (amazing) llama.cpp project.
To run the 70B model you can try:
Note you'll most likely need a Mac with 64GB of shared memory, and there's still a bit of work to do to make sure 70B works like a charm.