by accrual
341 days ago
I wondered something similar. Perhaps a local model cached on a 16GB or 24GB graphics card would perform well too. It would have to be a quantized or distilled model, but that may be sufficient, especially with some additional training as you mentioned.
https://huggingface.co/unsloth/Qwen3-0.6B-unsloth-bnb-4bit
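For a rough sense of what fits on such a card, here is a back-of-the-envelope sketch that estimates weight memory from parameter count and bits per weight. The function names and the 20% overhead allowance (for activations, KV cache, and runtime buffers) are my own assumptions, not measured figures, and the 0.6e9 parameter count is only read off the "0.6B" in the model name above.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


def fits_in_vram(
    num_params: float,
    bits_per_weight: float,
    vram_gib: float,
    overhead_fraction: float = 0.2,  # assumed allowance, not a measured value
) -> bool:
    """Check whether weights plus an assumed overhead fit in the given VRAM."""
    needed = weight_memory_gib(num_params, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gib


if __name__ == "__main__":
    # Hypothetical inputs: ~0.6e9 parameters at 4 bits per weight.
    for vram in (16, 24):
        size = weight_memory_gib(0.6e9, 4)
        print(f"{size:.2f} GiB of weights, fits in {vram} GiB: "
              f"{fits_in_vram(0.6e9, 4, vram)}")
```

By this estimate a 4-bit model of that size uses only a small fraction of either card, so a considerably larger quantized model would likely fit as well. Real usage depends on context length and the inference runtime.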