Hacker News
by
francisduvivier
966 days ago
I had the same issue. In my case it was because I was trying to use a Llama 2 model. When I tried CodeLlama
https://huggingface.co/TheBloke/CodeLlama-7B-GGUF/tree/main
, which is based on the first llama, it works.
1 comment
francisduvivier
965 days ago
Correction: it looks like the problem has to do with the quantization instead: 8-bit quantization works, while anything lower does not seem to work. Another working model example (no conversion needed):
https://huggingface.co/TheBloke//Yarn-Mistral-7B-64k-GGUF/ya...
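For anyone choosing a file from one of these repos, here is a minimal sketch of filtering for the 8-bit variant by file name. It assumes the files follow a `<model>.Q<bits>_<variant>.gguf` naming pattern; the file names in the list are illustrative and not checked against the actual repo contents, and `quant_bits` and `pick_8bit` are hypothetical helpers, not part of any library.

```python
import re

# Assumption: the quantization tag (e.g. "Q8_0", "Q4_K_M") sits right before
# the ".gguf" extension. The leading number is taken as the bit width.
QUANT_TAG = re.compile(r"\.Q(\d+)_[A-Z0-9_]+\.gguf$", re.IGNORECASE)


def quant_bits(filename):
    """Return the bit width parsed from the file name, or None if no tag is found."""
    match = QUANT_TAG.search(filename)
    return int(match.group(1)) if match else None


def pick_8bit(filenames):
    """Keep only the files whose quantization tag says 8 bits."""
    return [name for name in filenames if quant_bits(name) == 8]


# Illustrative file names only (assumed, not fetched from the repo).
files = [
    "codellama-7b.Q2_K.gguf",
    "codellama-7b.Q4_K_M.gguf",
    "codellama-7b.Q8_0.gguf",
]

print(pick_8bit(files))
```

This only inspects names, so it says nothing about whether a given file will actually load; it just narrows the list to the 8-bit candidates that worked in my case.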