Hacker News
by chrisheecho 416 days ago
Hey there, we're Chris Cho (x: chrischo_pm, Vertex PM focusing on DevEx) and Ivan Nardini (x: ivnardini, DevRel). We heard you, and we'd like to answer your questions as directly as possible.

First of all, thank you for the positive sentiment about our latest Gemini 2.5 model. We are so glad that you find the models useful! We really appreciate this thread and everyone's feedback on Gemini/Vertex.

We read through all your comments. And YES, clearly, we've got some friction in the DevEx. This feedback is super valuable and helps me prioritize. Our goal is to listen, gather your insights, offer clarity, and point to potential solutions or workarounds.

I’m going to respond to some of the comments directly here on the thread.

8 comments

Had to move away from Gemini because the SDK just didn't work.

Regardless of whether I passed a role or not, the function would say something to the effect of "invalid role, accepted are user and model".

Tried switching to the OpenAI-compatible SDK, but it threw errors for tool calls, and I just gave up.

Could you confirm if it was a known bug that was fixed?

You don't have to specify a role when you call through Python (https://cloud.google.com/vertex-ai/generative-ai/docs/start/...), which I think is what you are using, but maybe I'm wrong.

Feel free to DM me at @chrischo_pm on X. What you are describing shouldn't happen.

Can we avoid weekend changes to the API? I know it's all non-GA, but having `includeThoughts` suddenly work at ~10AM UTC on a Sunday and the raw thoughts being returned after they were removed is nice, but disruptive.
Can you tell me exactly when this happened, please? I will take this feedback back to my colleagues, but in order to change how we behave I need a baseline and data.
Thoughts used to be available in the Gemini/Vertex APIs when Gemini 2.0 Flash Thinking Experimental was initially introduced [1][2]. They were subsequently disabled for the public (I assume hidden behind a visibility flag) shortly after DeepSeek R1's release [3], regardless of the `include_thoughts` setting.

At ~10:15AM UTC 04 May, a change was rolled out to the Vertex API (but not the Gemini API) that caused the API to respect the `include_thoughts` setting and return the thoughts. For consumers that don't handle the thoughts correctly and had specified `include_thoughts = true`, the thinking traces then leaked into responses.

[1]: https://googleapis.github.io/python-genai/genai.html#genai.t...

[2]: https://ai.google.dev/api/generate-content#ThinkingConfig

[3]: https://github.com/googleapis/python-genai/blob/157b16b8df40...
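
For anyone hit by the same leak, here is a minimal defensive sketch: drop every part flagged as a thought before using the text. It assumes the response has already been converted to plain dicts and that thought parts carry a boolean `thought` field; both are assumptions to check against the API reference, and `visible_text` is a hypothetical helper name, not part of any SDK.

```python
from typing import Any


def visible_text(response: dict[str, Any]) -> str:
    """Join the text of all non-thought parts of the first candidate."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(
        part.get("text", "")
        for part in parts
        # Assumption: thought parts are marked with `"thought": True`.
        if not part.get("thought", False)
    )


# Hand-written example payload, not captured from a real API call.
example = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [
                {"text": "Let me reason about this...", "thought": True},
                {"text": "The answer is 4."},
            ],
        }
    }]
}
print(visible_text(example))  # The answer is 4.
```

Filtering on the client keeps output stable whether or not the server honors `include_thoughts` on a given day.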

Can you ask whoever owns dashboards to make it so I can troubleshoot quota exceeded errors like this? https://x.com/spyced/status/1917635135840858157
We are working on fixing this and showing the critical ones in AIS. I agree it is crazy that there are 700+ items here. Real pain in the neck to deal with.
I love that you're responding on HN, thanks for that! While you're here I don't suppose you can tell me when Gemini 2.5 Pro is hitting European regions on Vertex? My org forbids me from using it until then.
Yeah, not having clear timelines for new releases while being quick to deprecate older models isn't a very good experience.
Thanks for replying, and I can safely say that most of us just want first-class conformity with OpenAI's API without JSON schema weirdness (not using refs, for instance) baked in.
Or returning null for null values, not some "undefined" string.

Or not failing when passing `additionalProperties: false`

Or..
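
Until that conformity exists, one workaround for the two schema complaints above is to rewrite the schema before sending it. The sketch below inlines local `#/$defs/...` references and drops `additionalProperties`. It assumes definitions live under `$defs` and are non-recursive; `simplify_schema` is a hypothetical helper, and whether a given endpoint still needs this is not confirmed here.

```python
import copy
from typing import Any

DROPPED_KEYWORDS = ("$defs", "additionalProperties")
REF_PREFIX = "#/$defs/"


def simplify_schema(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy with local $refs inlined and dropped keywords removed."""
    defs = schema.get("$defs", {})

    def walk(node: Any, resolving: tuple[str, ...]) -> Any:
        if isinstance(node, list):
            return [walk(item, resolving) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str):
            if not ref.startswith(REF_PREFIX):
                raise ValueError(f"unsupported $ref: {ref}")
            name = ref[len(REF_PREFIX):]
            if name in resolving or name not in defs:
                raise ValueError(f"cannot inline $ref: {ref}")
            # Keywords written next to $ref override the referenced definition.
            siblings = {k: v for k, v in node.items() if k != "$ref"}
            merged = {**copy.deepcopy(defs[name]), **siblings}
            return walk(merged, resolving + (name,))
        out: dict[str, Any] = {}
        for key, value in node.items():
            if key in DROPPED_KEYWORDS:
                continue
            if key == "properties" and isinstance(value, dict):
                # Property names are data, not keywords, so keep them all.
                out[key] = {n: walk(sub, resolving) for n, sub in value.items()}
            else:
                out[key] = walk(value, resolving)
        return out

    return walk(schema, ())


schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {"home": {"$ref": "#/$defs/Address"}},
    "$defs": {
        "Address": {
            "type": "object",
            "additionalProperties": False,
            "properties": {"city": {"type": "string"}},
        }
    },
}
print(simplify_schema(schema))
```

Recursive definitions raise an error here instead of looping forever, since they cannot be expressed without refs at all.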

Hi, one thing I am really struggling with in the AI Studio API is stop_sequences. I know how to request them, but cannot see how to determine which stop_sequence was triggered. They don't show up in the stop_reason like in most other APIs. Is that something the Vertex API can do? I've built some automation tools around stop_sequences, using them for control logic, but I can't use Gemini as the controller without a lot of brittle parsing logic.
Thank you, feedback noted.
Is there an undocumented hardcoded timeout for Gemini responses even in streaming mode? JSON output according to a schema can get quite lengthy, and I can't seem to get all of it for some inputs because Gemini seemingly terminates requests.
This is probably just you hitting the model's internal output length maximum. It's 65,536 tokens for 2.5 Pro and Flash.

For other models, see this link and open up the collapsed section for your specific model: https://ai.google.dev/gemini-api/docs/models
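
One way to tell a length cutoff from a timeout is to look at the finish reason on the last streamed chunk before parsing. A minimal sketch, assuming chunks arrive as plain dicts shaped like the REST response and that a length cutoff is reported as `finishReason == "MAX_TOKENS"`; `collect_json` and `TruncatedOutput` are hypothetical names, not SDK API.

```python
import json
from typing import Any, Iterable


class TruncatedOutput(Exception):
    """Raised when generation stopped at the output-token limit."""


def collect_json(chunks: Iterable[dict[str, Any]]) -> Any:
    """Concatenate streamed text and parse it, refusing truncated output."""
    text_pieces: list[str] = []
    finish_reason = None
    for chunk in chunks:
        # Only the first candidate of each chunk is considered.
        for candidate in chunk.get("candidates", [])[:1]:
            for part in candidate.get("content", {}).get("parts", []):
                text_pieces.append(part.get("text", ""))
            # Keep the most recent finish reason seen in the stream.
            finish_reason = candidate.get("finishReason", finish_reason)
    if finish_reason == "MAX_TOKENS":
        raise TruncatedOutput("hit the output-token limit; JSON is incomplete")
    return json.loads("".join(text_pieces))


# Hand-written example stream, not captured from a real API call.
chunks = [
    {"candidates": [{"content": {"parts": [{"text": '{"items": [1, 2'}]}}]},
    {"candidates": [{"content": {"parts": [{"text": ", 3]}"}]},
                     "finishReason": "STOP"}]},
]
print(collect_json(chunks))  # {'items': [1, 2, 3]}
```

Raising a dedicated error makes it clear the fix is a smaller schema or paginated output, not a longer client timeout.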

Thanks! It might just be that!
This is so cringe.

I hope it doesn't become a trend on this site.

A team taking the opportunity to engage directly with their users to understand their feedback so they can improve the product? So cringe.
Google usually doesn't care what users say at all. This is why they so often have product-crippling bugs and missing features. At least this guy is making a show of trying before he transfers to another project.
It’s the US style, which has made its way across the pond too: you have to make upbeat noises to remove any suspicion you’re criticizing.
Unlike others ... you got it.

It is incredibly lame for a gargantuan company like Google and their thousands of developers and PMs and this and that ... to come to a remote corner of the web to pretend they are doing what they should have done 10 years ago.

Google should have cleaned up its Gemini API 10 years ago?
>Chat, briefly, what does a PM at a company like Google do?

"A Product Manager (PM) at Google is responsible for guiding the development of products from conception to launch. They identify user needs, define product vision and strategy, prioritize features, work with cross-functional teams (engineering, design, marketing), and ensure the product aligns with business goals. They act as the bridge between technical teams and stakeholders to deliver successful, user-focused solutions."

Some might have ignored your question, but in the spirit of good conversation, I figured I’d share a quick explanation of what a PM does, just in case it helps!

This sounds accurate. I see myself as a Pain Manager more than a Product Manager. Product just solves the pain that users have ;)

Sometimes we get it right the first time we launch, but I think most of the time we get it right over a period of time.

Trying to do a little bit better every day and ship as fast as possible!