| HN Mirror



	by misterdata 1144 days ago
	Since they accept a JSON schema for the function calls, they are likely using token biasing based on the schema: some kind of state machine that follows along with the generated tokens and only allows the next token to be one that is valid under the grammar/schema. I have successfully implemented this for a limited subset of JSON Schema on llama.cpp. See also e.g. this implementation: https://github.com/1rgs/jsonformer
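
A minimal sketch of the token-biasing idea described above, not the llama.cpp or jsonformer implementation. Everything in it is a hypothetical stand-in: the toy vocabulary `VOCAB`, the "schema" expressed as the list `ALLOWED_OUTPUTS`, and `fake_model_logits`, which plays the role of the language model. A real decoder would compile the JSON Schema into a state machine rather than enumerate the permitted strings.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of tokens.
VOCAB = ['{', '}', '"unit"', ':', '"celsius"', '"fahrenheit"', '"kelvin"', 'hello', ',']

# Toy "schema": {"unit": <one of two enum values>}, written out as the set of
# complete outputs it permits. This stands in for a real grammar/state machine.
ALLOWED_OUTPUTS = ['{"unit":"celsius"}', '{"unit":"fahrenheit"}']


def allowed_token_ids(prefix):
    """Ids of tokens that keep the output a prefix of some permitted string."""
    return [i for i, tok in enumerate(VOCAB)
            if any(out.startswith(prefix + tok) for out in ALLOWED_OUTPUTS)]


def mask_logits(logits, allowed):
    """Hard biasing: every disallowed token gets a logit of -inf."""
    allowed = set(allowed)
    return [l if i in allowed else -math.inf for i, l in enumerate(logits)]


def fake_model_logits(prefix):
    """Stand-in for a model that prefers junk ('hello') and an invalid enum value."""
    scores = {'hello': 5.0, '"kelvin"': 4.0, '"fahrenheit"': 3.0}
    return [scores.get(tok, 1.0) for tok in VOCAB]


def constrained_decode(max_steps=10):
    """Greedy decoding where only schema-valid tokens can be chosen."""
    prefix = ''
    for _ in range(max_steps):
        if prefix in ALLOWED_OUTPUTS:  # a complete, valid output was produced
            break
        masked = mask_logits(fake_model_logits(prefix), allowed_token_ids(prefix))
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        prefix += VOCAB[best]
    return prefix


print(constrained_decode())
```

Without the mask, the stand-in model's top choice is always `hello`; with it, the only tokens that can win at each step are those that extend a valid output, so the invalid `"kelvin"` is never selected even though it scores higher than `"fahrenheit"`.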


newhouseb 1144 days ago

As someone also building constrained decoders against JSON [1], I was hoping to see the same, but I note the following from their documentation [2]:

  The model can choose to call a function; if so, the content will be a stringified JSON object adhering to your custom schema (note: the model may generate invalid JSON or hallucinate parameters).

So sadly, it is just fine-tuning. There's no hard biasing applied :(. So close, and yet so far, OpenAI!

[1] https://github.com/newhouseb/clownfish

[2] https://platform.openai.com/docs/guides/gpt/function-calling
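
Since the documentation quoted above warns that the model may generate invalid JSON or hallucinate parameters, the caller has to check the returned arguments string itself. Here is a minimal sketch of such a check using only the standard library. The schema `SCHEMA`, the helper `check_arguments`, and the small `JSON_TYPES` table are hypothetical and cover only a tiny subset of JSON Schema (string and boolean properties, `required`, `enum`); they are assumptions for illustration, not part of any OpenAI API.

```python
import json

# Hypothetical function-call schema, for illustration only.
SCHEMA = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

# Only the JSON types this sketch needs.
JSON_TYPES = {"string": str, "boolean": bool}


def check_arguments(raw, schema):
    """Return a list of problems found in `raw`; an empty list means it passed."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments are not a JSON object"]

    problems = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required parameter: {key}")
    for key, value in args.items():
        if key not in props:
            # The model invented a parameter the schema does not define.
            problems.append(f"hallucinated parameter: {key}")
        elif not isinstance(value, JSON_TYPES[props[key]["type"]]):
            problems.append(f"wrong type for parameter: {key}")
        elif "enum" in props[key] and value not in props[key]["enum"]:
            problems.append(f"value not in enum for parameter: {key}")
    return problems


print(check_arguments('{"location": "Paris", "mood": "sunny"}', SCHEMA))
```

A check like this only detects bad output after the fact; hard biasing during decoding would prevent it from being produced in the first place.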


jumploops 1144 days ago

They may have just fine-tuned 3.5 to respond with valid JSON more often than not.

While building magic functions [0], I ran into many examples where JSON Schema broke for gpt-3.5-turbo but worked well for gpt-4.

[0] https://github.com/jumploops/magic


civilitty 1144 days ago

Or there’s a trade-off between more complex schemas and logit bias going off the rails, since there’s probably little to no backtracking.


newhouseb 1143 days ago

Good point. Backtracking is certainly possible, but it is probably tricky to parallelize at scale if you're trying to coalesce and slam through a bunch of concurrent (unrelated) requests with minimal pre-emption.
