|
|
|
|
|
by MacsHeadroom
1151 days ago
|
|
The best option right now is SuperCOT 30B, a LoRA for LLaMA trained on Chain-of-Thought coding questions and answers. You will need at least 24GB of VRAM to load the 4bit GPTQ quantized LoRA merged model* locally. https://huggingface.co/tsumeone/llama-30b-supercot-4bit-128g... *The merged model contains the LoRA. Applying the LoRA over a "raw" llama model uses more VRAM and does not allow for full context in 24GB of VRAM. |
|