by ac29
557 days ago
The model you are running isn't the one used in the benchmarks you link. The default llama3.3 model in ollama is heavily quantized (~4 bit). Running the full fp16 model, or even an 8-bit quant, wouldn't be possible on your laptop with 64 GB of RAM.
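To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes the 70B parameter count of Llama 3.3 and counts weights only, ignoring the KV cache, runtime overhead, and the extra bits per weight that real quantization formats carry. The function names and the 64 GB figure as a hard limit are illustrative assumptions, not anything ollama reports.

```python
# Rough weight-only memory estimate. Assumptions: 70e9 parameters,
# exactly `bits_per_weight` bits each, 1 GB = 10^9 bytes.
LLAMA_3_3_PARAMS = 70e9  # assumed parameter count (70B)
LAPTOP_RAM_GB = 64       # RAM figure taken from the comment above


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_ram(size_gb: float, ram_gb: float = LAPTOP_RAM_GB) -> bool:
    """True if the weights alone are smaller than total RAM."""
    return size_gb < ram_gb


for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_memory_gb(LLAMA_3_3_PARAMS, bits)
    print(f"{label}: ~{size:.0f} GB, fits in {LAPTOP_RAM_GB} GB: {fits_in_ram(size)}")
```

Under these assumptions only the ~4-bit quant leaves any headroom, and in practice the OS and context cache eat into that as well.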