|
|
|
|
|
by technovangelist
1041 days ago
|
|
I am using ollama today on a MacBook Pro M1Max with 64GB. Using a llama2 70b model, I am getting about 7 tokens/second with the onboard gpu. Before ollama used gpu, that was much slower. To compare, the 7b model gets me closer to 55 tokens/second. There is no way it could achieve those numbers without the gpu. |
|