Hacker News new | ask | show | jobs
by daakus 990 days ago
It can! TheBloke is to thank for the incredibly quick turnaround.

https://github.com/ggerganov/llama.cpp/pull/3362

https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/tree/ma...

3 comments

Birds fly, sun shines, and TheBloke always delivers.

Though I can't figure out that prompt and with LLama2's template it's... weird. Responds half in Korean and does unnecessary numbering of paragraphs.

Just one big sigh towards those supposed efforts on prompt template standardization. Every single model just has to do something unique that breaks all compatibility but has never resulted in any performance gain.

I used the prompt included in llama.cpp and it worked for me in English (for fun GK type questions):

MODEL=./models/mistral-7b-v0.1.Q5_K_M.gguf N_THREAD=16 ./examples/chat-13B.sh

I have yet to get any useful output out of the Q5_K_S version; haven't tried any others yet.
Linked is the base model. What you want is the instruct model (also on TheBloke's profile), which has been trained on following instructions.
I used mistral-7b-v0.1.Q5_K_M.gguf and it responded to basic questions.
Wow, awesome!