|
|
|
|
|
by GordonS
1189 days ago
|
|
Nice, main.exe seems to work just fine with the 7B quantized model - generates a token every 400ms on an AMD Ryzen 5 2600! But, quantize.exe doesn't seem to work - any valid command (such as below) pauses for a split second, then returns with no output? $ quantize.exe ggml-model-f16.bin ggml-model-q4_0.bin 2 |
|