|
|
|
|
|
by solaire_oa
361 days ago
|
|
I also think that structured outputs are criminally underused, but it isn't perfect... and per your example, it might not even be good, because I've done something similar. I was trying to make a decent cocktail recipe database, and scraped the text of cocktails from about 1400 webpages. Note that this was just the text of the cocktail recipe, and cocktail recipes are comparatively small. I sent the text to an LLM for JSON structuring, and the LLM routinely miscategorized liquor types. It also failed to normalize measurements with explicit instructions and the temperature set to zero. I gave up. |
|
the idea is that instead of using JSON.parse, we create a custom Type.parse for each type you define.
so if you want a:
And the LLM happens to output: We can upcast "Amazon" -> ["Amazon"] since you indicated that in your schema.https://www.boundaryml.com/blog/schema-aligned-parsing
and since its only post processing, the technique will work on every model :)
for example, on BFCL benchmarks, we got SAP + GPT3.5 to beat out GPT4o ( https://www.boundaryml.com/blog/sota-function-calling )