by elbear 72 days ago
In case you don't know, Gemini 2.5 flash is hosted on DeepInfra. They also have 1.5 flash but not 2.0 flash.

I have no affiliation with DeepInfra. I use them because they host good open-source models.

1 comments

Thanks. Yeah, for now we're moving to 3.1 Flash Lite, as that's the new cheapest at $0.25/1M and is still "good enough". 2.5 Flash is more expensive at $0.30/1M (it looks like DeepInfra charges the same as GCP/Vertex AI for it). I might check them out for Gemma, though. We benchmarked Gemma 2 when it came out, and it wasn't remotely usable for us, largely because the context window was way too small. It looks like Gemma 3 or 4 might be worth evaluating.
Xiaomi's mimo-v2-flash is great if you care about speed and performance - it's 1/10 the price of Gemini 3.1 Flash Lite and faster (on OpenRouter).
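
For anyone who wants to plug in their own volume, here is a rough sketch of the price comparison using only the figures quoted in this thread. The dictionary keys are just labels (not necessarily real API model IDs), the mimo-v2-flash rate is derived from the "1/10 the price" claim rather than from a published price, the monthly token count is a made-up placeholder, and the quoted rates don't distinguish input from output tokens, so treat the output as a ballpark only.

```python
# Prices in USD per 1M tokens, taken from the comments in this thread.
# Keys are illustrative labels, not confirmed API model identifiers.
PRICE_PER_1M = {
    "gemini-3.1-flash-lite": 0.25,
    "gemini-2.5-flash": 0.30,
    # Assumption: derived from the "1/10 the price" claim, not a quoted rate.
    "mimo-v2-flash": 0.25 / 10,
}


def cost_usd(model: str, tokens: int) -> float:
    """Return the cost in USD of processing `tokens` tokens with `model`."""
    return PRICE_PER_1M[model] * tokens / 1_000_000


if __name__ == "__main__":
    monthly_tokens = 2_000_000_000  # hypothetical volume; replace with your own
    for model in PRICE_PER_1M:
        print(f"{model}: ${cost_usd(model, monthly_tokens):,.2f}")
```

At these rates the gap between 2.5 Flash and 3.1 Flash Lite is $0.05 per 1M, so it only adds up at fairly high volume.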

GCP does serve other non-Google models, but I'm not sure what they have other than Anthropic models. I don't think Haiku is a great model, though.