Happy to chat internally if you want; feel free to reach out.
I see a lot of people swearing by one model without having tried the others. I also see a lot of opinions based on a snapshot of the tooling from ~January, when, for example, Claude Code was exceptional, and those opinions don't appear to have been updated since. In blind tests the models appear to be much closer than some folks would have you believe.
Google models are well known for being quite terse and efficient on cost – reasonably low pricing for what they are and reasonably low token use for what they achieve.
But as I said, do reach out if you are actually a Googler, as my points are really about the internal tech, which I am pretty positive about.