by sourcecodeplz
35 days ago
Running LLMs locally is fun and powerful, but if you want to get work done... it is a big headache. You have to pre-plan, plan, write specs, etc. The big OpenAI and Claude models just get you after a few sentences.
If you're already doing big boy stuff with big boy models, then... just carry on trucking!
Only place I'd differ is for vision/OCR tasks. Small/medium open-weights models are as good as SoTA, and token prices for prefill are really not worth it for larger batch tasks.
The other thing people forget is that if you want even a smallish LLM as a reliable personal service, you've got to carve out 16-24 GB of (V)RAM and leave it permanently running.