by fellatio 368 days ago
Any model can produce perfectly schema-conforming JSON if you mask out the logits of non-conforming tokens at each decoding step.

I imagine that validation as you go could slow things down though.
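For illustration, here is a minimal sketch of the idea with a toy setup. Everything in it is an assumption made up for the example: the tiny `VOCAB`, the two-entry `VALID_OUTPUTS` standing in for a schema, and `fake_logits` standing in for a real model's forward pass. At each step, every token that would make the text a dead end gets a score of -inf, so only conforming tokens can be picked.

```python
import json
import math
import random

# Hypothetical toy vocabulary and "schema" (just the set of valid outputs).
VOCAB = ['{', '}', '"ok"', ':', 'true', 'false', 'hello', ' ', '[', ']']
VALID_OUTPUTS = ['{"ok":true}', '{"ok":false}']


def is_valid_prefix(text):
    """True if text can still be extended to a schema-valid output."""
    return any(v.startswith(text) for v in VALID_OUTPUTS)


def fake_logits(rng):
    """Stand-in for a model forward pass: one random score per token."""
    return [rng.gauss(0.0, 1.0) for _ in VOCAB]


def constrained_decode(seed=0, max_steps=16):
    """Greedy decoding where non-conforming tokens are masked out."""
    rng = random.Random(seed)
    text = ""
    for _ in range(max_steps):
        if text in VALID_OUTPUTS:
            break
        logits = fake_logits(rng)
        # Mask: tokens that break the schema get -inf and can never win.
        masked = [
            score if is_valid_prefix(text + tok) else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        text += VOCAB[best]
    return text


if __name__ == "__main__":
    print(json.loads(constrained_decode(seed=0)))
```

Note that the validity check runs once per vocabulary entry per step, which is where the "validation as you go" cost comes from in this naive version.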

2 comments

The technical term is constrained decoding. OpenAI has had this for almost a year now. They say it requires generating some artifacts to do efficiently, which slows down the first response but can be cached.
I expect this is a problem pattern we will see a lot with LLMs.

Do I look at whether the data format is easily output by my target LLM?

Or do I just validate and clamp/discard non-conforming output?

Always using the latter seems pretty inefficient.
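For comparison, the validate-and-discard approach looks roughly like the sketch below. The `model` argument is a hypothetical zero-argument callable standing in for an unconstrained LLM call, and the "schema" (an object with a single boolean `ok` field) is invented for the example.

```python
import json


def conforms(text):
    """Validate: must parse as JSON and be an object with one boolean "ok" field."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and set(data) == {"ok"}
        and isinstance(data["ok"], bool)
    )


def generate_with_retries(model, max_attempts=20):
    """Call `model` until its output conforms; return (parsed data, attempts used)."""
    for attempt in range(1, max_attempts + 1):
        text = model()
        if conforms(text):
            return json.loads(text), attempt
        # Non-conforming output is discarded, and the whole generation is wasted.
    raise RuntimeError("no conforming output within max_attempts")
```

If each attempt conforms independently with probability p, the expected number of full generations is 1/p, so the cost grows quickly as the format gets harder for the model to hit.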