by lhl
1165 days ago
Ah, thanks a lot. I had tried out the llama.cpp examples before the k-shot chat prompts, and this is definitely much better! I have a 5950X as well, but sadly token generation is a bit too slow for me now. (I've had turbo turned off for efficiency, but maybe I'll see if the extra cycles help.) I'm giving 30B a try on my GPU now with https://github.com/oobabooga/text-generation-webui/wiki/LLaM... and if that's not good, I'll try layer offloading with 65B and see if I can get it running well.