by fl7305
1167 days ago
Thanks, I'm testing various models myself, and am very interested in hearing about people's experiences with them. It looks like Alpaca 30B 4-bit does better when thinking out loud. It starts with some prompt engineering:

Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in England.
Bob: Sure. The largest city in England is London, the capital of the United Kingdom.
User: A glass door has ‘push’ written on it in mirror writing. Should you push or pull it? Please think out loud step by step.
Bob: Hmmm... I'm not sure what to do. Let me think about this for a moment.
Okay, so the door says 'push', but it is written in mirror writing. That means that if I were to follow the instructions on the door, I would have to push the door in the opposite direction of the way it is written. So, if the door says 'push' and it is written in mirror writing, then I should pull the door instead.
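
For anyone who wants to try the same trick on other questions, here is a rough sketch of how the prompt could be put together in Python. The helper name `build_prompt` and the constants are my own invention for illustration, not part of llama.cpp; the preamble and the "think out loud" suffix are copied from the transcript above.

```python
# Hypothetical helper for illustration only; nothing here is a llama.cpp API.
PREAMBLE = (
    "Transcript of a dialog, where the User interacts with an Assistant named Bob. "
    "Bob is helpful, kind, honest, good at writing, and never fails to answer "
    "the User's requests immediately and with precision.\n"
    "User: Hello, Bob.\n"
    "Bob: Hello. How may I help you today?\n"
    "User: Please tell me the largest city in England.\n"
    "Bob: Sure. The largest city in England is London, "
    "the capital of the United Kingdom."
)

# The suffix that asks the model to reason before answering.
COT_SUFFIX = "Please think out loud step by step."


def build_prompt(question: str, think_out_loud: bool = True) -> str:
    """Append one User turn to the preamble and leave Bob's turn open."""
    question = question.strip()
    if think_out_loud:
        question = f"{question} {COT_SUFFIX}"
    return f"{PREAMBLE}\nUser: {question}\nBob:"


if __name__ == "__main__":
    print(build_prompt(
        "A glass door has 'push' written on it in mirror writing. "
        "Should you push or pull it?"
    ))
```

Passing `think_out_loud=False` gives the same prompt without the suffix, which makes it easy to compare the two variants on the same model.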
Model info:

./main -m ./models/alpaca-lora-30B-ggml/ggml-model-q4_1.bin --temp 0.2 --top_k 30 --top_p 0.30 -c 2048 -b 2048 -n 1024 --keep 1024 --repeat_penalty 1.1 --color -i -r "User:" -f prompts/chat-with-bob.txt
main: seed = 1681138223
llama_model_load: loading model from './models/alpaca-lora-30B-ggml/ggml-model-q4_1.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 2048
llama_model_load: n_embd = 6656
llama_model_load: n_mult = 256
llama_model_load: n_head = 52
llama_model_load: n_layer = 60
llama_model_load: n_rot = 128
llama_model_load: f16 = 3
llama_model_load: n_ff = 17920
llama_model_load: n_parts = 4
llama_model_load: type = 3
llama_model_load: ggml map size = 23269.46 MB
llama_model_load: ggml ctx size = 151.25 KB
llama_model_load: mem required = 25573.60 MB (+ 3124.00 MB per state)
llama_model_load: loading tensors from './models/alpaca-lora-30B-ggml/ggml-model-q4_1.bin'
llama_model_load: model size = 23269.01 MB / num tensors = 543
llama_init_from_file: kv self size = 3120.00 MB
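
As a sanity check on the numbers in that log, here is a small Python sketch of where the "kv self size" figure could come from. It assumes the cache holds one key and one value vector of `n_embd` 16-bit floats per layer per context position; that is my assumption about the layout, though the arithmetic does land on the logged value. The function name `kv_cache_mb` is made up for illustration.

```python
# Values copied from the llama_model_load lines above.
N_LAYER = 60
N_CTX = 2048
N_EMBD = 6656
N_HEAD = 52

BYTES_PER_F16 = 2  # assumption: keys and values are stored as 16-bit floats


def kv_cache_mb(n_layer: int, n_ctx: int, n_embd: int,
                bytes_per_elem: int = BYTES_PER_F16) -> float:
    """Size in MB of a cache holding K and V for every layer and position."""
    n_elements = 2 * n_layer * n_ctx * n_embd  # factor 2 = keys + values
    return n_elements * bytes_per_elem / (1024 * 1024)


if __name__ == "__main__":
    # Per-head dimension; matches n_rot = 128 in the log.
    print(N_EMBD // N_HEAD)                      # 128
    print(kv_cache_mb(N_LAYER, N_CTX, N_EMBD))   # 3120.0
```

Under the same assumption, halving the context (`-c 1024`) would halve this part of the memory use, since the size scales linearly with `n_ctx`.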