by geysersam
1201 days ago
Another answer in the thread said this:

> I'm pretty sure there's a mistake here: https://github.com/cocktailpeanut/dalai/blob/main/index.js#L... , there's a ${suffix} missing

> It causes the quantization process to always use the first part of the model if using a larger size than 7B. I don't even know what this stuff does, but I see the ggml-model-f16.bin files have ggml-model-f16.bin.X as well in the folder, so I'm pretty sure this is a mistake.

Maybe that's what is causing the loss of accuracy?
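
To make the suspected bug concrete, here is a rough sketch of how a missing `${suffix}` could make every iteration pick the first file. I haven't checked this against the real index.js: the function name `partPaths`, the `withSuffix` flag, the two-part count, and the assumption that the first part has no suffix are all hypothetical and only for illustration.

```javascript
// Hypothetical sketch, NOT dalai's actual code: it only illustrates how a
// missing ${suffix} in a template string makes a loop reuse the first file.
function partPaths(base, numParts, { withSuffix }) {
  const paths = [];
  for (let i = 0; i < numParts; i++) {
    // Assumption: part 0 has no suffix, later parts are base.1, base.2, ...
    const suffix = i === 0 ? "" : `.${i}`;
    // With the suffix left out, every iteration yields the same path.
    paths.push(withSuffix ? `${base}${suffix}` : `${base}`);
  }
  return paths;
}

// Assumed two-part model, purely for the example.
const buggy = partPaths("ggml-model-f16.bin", 2, { withSuffix: false });
const fixed = partPaths("ggml-model-f16.bin", 2, { withSuffix: true });

console.log(buggy); // both entries point at the first part
console.log(fixed); // second entry points at ggml-model-f16.bin.1
```

If the real loop behaves like the `buggy` case, only the first part would ever be quantized for the larger models, which would fit the quoted description.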