
> It's pretty common to use a cheaper model to fix these errors to match the schema if it fails with a tool call.

This has not been true for a while.

For open models there's zero need for these kinds of hacks. Libraries like XGrammar and Outlines (and several others) exist both as standalone solutions and as components in a wide range of open source tools, and they ensure structured generation happens at the logit level. There's no need to add multiples to your inference cost when, in some cases (XGrammar), constrained decoding can actually reduce inference cost.
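To make "at the logit level" concrete, here is a toy sketch of the idea. This is not the XGrammar or Outlines API: the vocabulary, the `GRAMMAR` state machine, and the `fake_logits` stand-in for a model forward pass are all hypothetical, made up for illustration. The point is only that tokens the grammar forbids are masked to negative infinity before a token is picked, so invalid output cannot be produced in the first place.

```python
import math
import random

# Toy vocabulary (hypothetical); a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"ok"', ':', 'true', 'false', 'hello', ',']

# Hypothetical grammar as a state machine: state -> {allowed token: next state}.
# It accepts exactly {"ok":true} or {"ok":false}.
GRAMMAR = {
    "start": {'{': "key"},
    "key": {'"ok"': "colon"},
    "colon": {':': "value"},
    "value": {'true': "close", 'false': "close"},
    "close": {'}': "done"},
}


def fake_logits(rng):
    """Stand-in for a model forward pass: one arbitrary score per token."""
    return [rng.uniform(-5.0, 5.0) for _ in VOCAB]


def constrained_decode(seed=0):
    """Greedy decoding where grammar-forbidden tokens are masked out."""
    rng = random.Random(seed)
    state, out = "start", []
    while state != "done":
        logits = fake_logits(rng)
        allowed = GRAMMAR[state]
        # Forbidden tokens get -inf, so they can never win the argmax.
        masked = [
            score if tok in allowed else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        tok = VOCAB[best]
        out.append(tok)
        state = allowed[tok]
    return "".join(out)


if __name__ == "__main__":
    print(constrained_decode(seed=0))
```

Whatever scores the "model" assigns, the result always parses against the schema, which is why no second repair pass with a cheaper model is needed. Real libraries do the same masking against a compiled grammar or JSON schema over the full tokenizer vocabulary.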

For proprietary models, more and more providers are using proper structured generation (i.e. constrained decoding) under the hood. Most notably, OpenAI's current version of structured outputs uses logit-based methods to guarantee the structure of the output.