by refulgentis 462 days ago
IMHO the biggest factor holding that back is how rushed and distanced these model releases are, still.

Both Phi-4-mini and Gemma 3 were released recently. Phi-4's damn close to a good, real model release. Microsoft's done a great job of iterating.

Gemma 3's an excellent, intelligent model, but it's got a gaping blind spot: tool-calling / JSON output. There was a vague, quick handwave about it in some PR, and a PM/eng on the Gemma team commented here, in response to someone else, that (TL;DR) "it's supported in Ollama!" That is Not Even Wrong, in the Pauli sense of the phrase.

- Ollama uses a weak, out-of-date llama.cpp thing where the output tokens are constrained to match a JSON schema. This falls apart almost immediately, i.e. as soon as there is more than one tool.

- The thing that matters isn't whether we can constrain output tokens. Any model can do that; I've had Llama 3 1B making tool calls that way. The thing that matters is A) did you train that in, and B) if you did, tell us the format.
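For anyone following along, here's roughly what the schema side of "constrain the output to a JSON schema" looks like once there is more than one tool. This is a minimal sketch, not Ollama's or Gemma's actual format: the tool names and the name/arguments envelope are made up for illustration.

```python
import json

# Hypothetical tool definitions, invented for this example:
# tool name -> JSON Schema properties for its arguments.
TOOLS = {
    "get_weather": {"city": {"type": "string"}},
    "set_timer": {"seconds": {"type": "integer"}},
}


def tool_call_schema(tools):
    """Build one JSON Schema that accepts a call to exactly one of the tools."""
    branches = []
    for name, params in tools.items():
        branches.append({
            "type": "object",
            "properties": {
                # Pin the tool name so each branch matches a single tool.
                "name": {"const": name},
                "arguments": {
                    "type": "object",
                    "properties": params,
                    "required": sorted(params),
                    "additionalProperties": False,
                },
            },
            "required": ["name", "arguments"],
            "additionalProperties": False,
        })
    # One branch per tool; a constrained decoder has to pick a branch.
    return {"anyOf": branches}


schema = tool_call_schema(TOOLS)
print(json.dumps(schema, indent=2))
```

The schema only guarantees the output parses as a call to some tool. Which branch gets picked is still down to the model, which is why whether the format was trained in matters.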

All that to say, IMHO we're still 6 months to a year out from BigCo understanding enough about their own stuff to even have a good base for it. Sure, tool calling and fine-tuning are orthogonal, in a sense, but in practice, if I'm interested in getting a specific type of output, odds are I want it formatted a specific way.

2 comments

Gemma3 1B seems to be able to choose which tool to use for very simple cases, if you constrain using anyOf, and narrow it down to just a few with RAG first.
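A rough sketch of that two-step setup, in case it helps: shortlist a few tools first, then constrain the model's choice to just those with anyOf. The catalogue is made up, and the word-overlap scoring is only a stand-in I'm assuming for the retrieval step; a real RAG setup would presumably use embeddings.

```python
# Hypothetical tool catalogue: tool name -> short description.
CATALOGUE = {
    "get_weather": "current weather forecast for a city",
    "set_timer": "start a countdown timer for some seconds",
    "send_email": "send an email message to a contact",
    "play_music": "play a song or playlist",
}


def shortlist(query, catalogue, k=2):
    """Rank tools by word overlap with the query and keep the top k.

    Word overlap is a placeholder for real retrieval (e.g. embeddings).
    """
    words = set(query.lower().split())

    def score(item):
        _, description = item
        return len(words & set(description.lower().split()))

    ranked = sorted(catalogue.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]


def choice_schema(names):
    """JSON Schema that only admits one of the shortlisted tool names."""
    return {"anyOf": [{"const": name} for name in names]}


picked = shortlist("what is the weather forecast in Oslo", CATALOGUE)
schema = choice_schema(picked)
print(picked)
print(schema)
```

The small model then only has to choose among k options instead of the whole catalogue.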

It can't understand numbers very well, though: "one thousand five" might become "1500".

JSON constraints seem to make these models unable to figure it out, even if they'd normally get it every time.

Maybe it's different with models above 4B though.

Could one now train a Gemma 3 fine-tune for tool use?

Found this on HF: https://huggingface.co/ZySec-AI/gemma-3-27b-tools