Hacker News
by mchusma 8 days ago
I think it's even more puzzling because you can't even run Gemma 31b on Google Cloud; they only let you test it with a rate limit. There's no way (that I can find) to actually pay them to use it.

We saw great results in our use case using Google directly. We moved to OpenRouter because Google wouldn't let us use it beyond a test.

Then OpenRouter's performance looked worse; not sure if it was a quantized version or something. So we looked at DeepSeek v4 Flash instead, and opted to go with that.

This model would probably be great as a super-low-cost cloud model. I would love to use it in the cloud, but Google makes you go elsewhere.


I'm using it for one of my use cases (OCR) on OpenRouter right now.
It's on OpenRouter. We just noticed performance was worse in a specific agentic app use case. It's possible we made an implementation mistake. My main point, though, is that Google is really silly not to host their own models.
I tested Gemma 4 31b for OCR and it's very good at it. This makes sense because I also get the best OCR results from Gemini compared to Claude or ChatGPT in my use case.