|
|
|
|
|
by XMasterrrr
507 days ago
|
|
I think I know what he means. I use AI Chat. I load Qwen2.5-1.5B-Instruct with llama.cpp server, fully offloaded to the CPU, and then I config AI Chat to connect to the llama.cpp endpoint. Checkout the demo they have below https://github.com/sigoden/aichat#shell-assistant |
|