by threeducks
165 days ago
I have tried a few Qwen-2.5 and 3.0 models (<=30B), even abliterated ones, but it seems that some words have been completely wiped from their pretraining dataset. No amount of prompting can bring back what has never been there. For comparison, I have also tried the smaller Mistral models, which have a much more complete vocabulary, but their writing sometimes lacks continuity. I have not tried the larger models due to lack of VRAM.