| This really captures something I've been experiencing with Gemini lately. The models are genuinely capable when they work properly, but there's this persistent truncation issue that makes them unreliable in practice. I've been running into it consistently, responses that just stop mid-sentence, not because of token limits or content filters, but what appears to be a bug in how the model signals completion. It's been documented on their GitHub and dev forums for months as a P2 issue. The frustrating part is that when you compare a complete Gemini response to Claude or GPT-4, the quality is often quite good. But reliability matters more than peak performance. I'd rather work with a model that consistently delivers complete (if slightly less brilliant) responses than one that gives me half-thoughts I have to constantly prompt to continue. It's a shame because Google clearly has the underlying tech. But until they fix these basic conversation flow issues, Gemini will keep feeling broken compared to the competition, regardless of how it performs on benchmarks. https://github.com/googleapis/js-genai/issues/707 https://discuss.ai.google.dev/t/gemini-2-5-pro-incomplete-re... |
If you want to use application/json as the specified output in the request, you can’t use tools
So if you need both, you either hope it gives you correct json when using tools (which many times it doesn’t). Or you have to do two requests, one for the tool calling, another for formatting
At least, even if annoying, this issue is pretty straightforward to get around