|
|
|
|
|
by roadside_picnic
2 hours ago
|
|
See my comment to parent. I've been using local LLMs for practical, personal tasks for a few months now very successfuly. You can run fantastic local models if you have either: - M-series Apple device with ideally >= 24GB of VRAM - RTX [345]090 GPU I'm fortunate enough to have both and use an M-series laptop as basically a persistent server (I don't use it much and when traveling typically just use my work laptop). My desktop doesn't act as a persitent server but I fire up llama.cpp on it all time for quick chat sessions. If you have one of the above devices and can dedicate it as server there are additional layers of tooling you can use that dramatically improve the experience. In particular Open WebUI allows you to add tons of useful tools (image gen, web search, code eval, etc), and agent harnesses like Hermes can make the current gen small models very capable. I have an agent in chat on my phone that basically handles all the sys-admin for the server it runs on. |
|