by aazo11
386 days ago
I tested on-device LMs (Gemma, DeepSeek) across prompt cleanup, PII redaction, math, and general knowledge on my M2 Max laptop using LM Studio + DSPy. Some observations:
- Gemma-3 is the best model for on-device inference
- 1B models look fine at first but break under benchmarking
- 4B can handle simple rewriting and PII redaction. It also did math reasoning surprisingly well.
- General knowledge Q&A does not work with a local model. This might work with a RAG pipeline or additional tools.

I plan on training and fine-tuning 1B models to see if I can build high-accuracy, task-specific models under 1GB in the future.
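
For anyone who wants to try something similar, here is a rough sketch of the kind of per-task scoring loop that exposes models which "look fine at first". It is not the harness I used (that was LM Studio + DSPy). `exact_match_accuracy`, `toy_redactor`, and the two examples are made up for illustration, and the regex redactor is only a stand-in for a call to a local model.

```python
import re
from typing import Callable, Iterable, Tuple

# Hypothetical stand-in for a local model: a regex that only knows about emails.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def toy_redactor(text: str) -> str:
    """Replace email addresses with a placeholder (assumed output format)."""
    return EMAIL.sub("[EMAIL]", text)


def exact_match_accuracy(
    model: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of examples where the model output equals the expected text."""
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(
        1 for prompt, expected in examples
        if model(prompt).strip() == expected.strip()
    )
    return hits / len(examples)


# Illustrative examples only, not data from my runs.
pii_examples = [
    ("Contact me at jane@example.com", "Contact me at [EMAIL]"),
    ("Call 555-0100 for details", "Call [PHONE] for details"),
]

score = exact_match_accuracy(toy_redactor, pii_examples)
print(f"exact-match accuracy: {score:.2f}")
```

Swapping `toy_redactor` for a function that sends the prompt to a locally served model lets you run the same examples against each model size. Exact match is a strict metric, so a looser comparison may be more appropriate for rewriting tasks.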