| I have tried a lot of local models. I have 656GB of them on my computer, so I have experience with a diverse array of LLMs. Gemma has been nothing to write home about and has disappointed me every single time I have used it. Models that are worth writing home about: EXAONE-3.5-7.8B-Instruct (excellent at taking podcast transcriptions and generating show notes and summaries); Rocinante-12B-v2i (fun for stories and D&D); Qwen2.5-Coder-14B-Instruct (good for simple coding tasks); OpenThinker-7B (good, fast reasoning); the DeepSeek distills (able to handle more complex tasks while still being fast); DeepHermes-3-Llama-3-8B (a really good vLLM); Medical-Llama3-v2 (very interesting, but be careful). Plus more, but not Gemma. |
One of the downsides of open models is that there are a gazillion little parameters at inference time (sampling strategy, prompt template, etc.) that can easily impair a model's performance. It takes some time for the community to iron out the wrinkles.
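To make the sampling point concrete, here is a rough sketch in plain Python of how two of those knobs, temperature and top-p, change which tokens a model can even pick from. The logits and function names are made up for illustration and are not taken from any particular inference library.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the surviving candidates sum to 1.
    return {i: probs[i] / mass for i in kept}

# Hypothetical logits for a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]

for temperature in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p=0.9)
    print(f"T={temperature}: {len(candidates)} candidate(s), top prob {max(probs):.3f}")
```

Same model, same prompt, but the candidate pool shrinks or grows depending on the settings, which is one reason early impressions of a release can vary so much between setups.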