Yeah, but gemma models are shit, objectively speaking. I've been building edge AI tools for my lab group, and the gemma lineage hallucinates so much it can't be used at all. We're almost exclusively using qwen models. Give a gemma model an OCR prompt with no image attached and it'll make things up, even when told to null any fields that aren't present. Qwen will a) follow instructions (placing things not clearly legible into a dedicated 'notes' field), b) return null for missing items, and c) get some pretty wild OCR tasks done quite well. It's almost got the opposite problem: I've had to limit how quickly people can submit an OCR-ingested label to the DB, because people started trusting it to never make mistakes, while gemma required correction on nearly every scan because of things it made up. So gemma didn't win on tok/s, accuracy, grounding of answers, etc. There's no conceivable spot where it wins, and this was qwen-vl:4b vs gemma e4b, so the qwen model is 1/3 of the size, runs faster, and is far more reliable. So what does gemma really bring to the table?
It's not the cheapest, it's worse than cheaper options, etc. All it really brings is the Google label.
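For anyone wanting to enforce the null-or-notes behavior described above, here's a rough sketch of a check you could run on the model's JSON before anything hits the DB. The field names (`sample_id`, `collected_on`, `collector`) and the `needs_review` flag are made up for illustration, not the actual schema from my setup, and it assumes the model was asked to return a flat JSON object.

```python
import json

# Hypothetical label fields, chosen only for illustration.
EXPECTED_FIELDS = ("sample_id", "collected_on", "collector")


def check_ocr_output(raw: str) -> dict:
    """Parse model output and enforce the null-or-notes contract.

    Every expected field must be present and be either a string or null.
    Anything the model could not read clearly belongs in 'notes'.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")

    record = {}
    for name in EXPECTED_FIELDS:
        if name not in data:
            # A missing key is a contract violation; absent values should be null.
            raise ValueError(f"missing key: {name}")
        value = data[name]
        if value is not None and not isinstance(value, str):
            raise ValueError(f"field {name} must be a string or null")
        record[name] = value

    notes = data.get("notes") or ""
    if not isinstance(notes, str):
        raise ValueError("notes must be a string")
    record["notes"] = notes

    # Flag the scan for a human look if anything was null or noted as unclear.
    record["needs_review"] = bool(notes.strip()) or any(
        record[name] is None for name in EXPECTED_FIELDS
    )
    return record
```

A record with `needs_review` set could be held back from direct submission, which is one way to slow people down without blocking clean scans.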
I mean, yeah, if it doesn't work at all nobody will use it. But the big players are spending $1k+/mo on AI at this point. That's obviously out of reach for many.