|
|
|
|
|
by chocolatkey
1202 days ago
|
|
I'm pretty sure there's a mistake here: https://github.com/cocktailpeanut/dalai/blob/main/index.js#L... , there's a ${suffix} missing It causes the quantization to process to always use the first part of the model if using a larger size than 7B. I don't even know what this stuff does, but I see the ggml-model-f16.bin files have ggml-model-f16.bin.X as well in the folder, so I'm pretty sure this is a mistake. Maybe it's causing the loss of accuracy? |
|