I'm not sure I understand. The docs for the Python client say that BAML types get converted to Pydantic models. Doesn't that step add the extra latency you mentioned?
My bad, I don't think I explained that correctly. Basically, when a "," is missing from an LLM output (amongst other issues) and that causes a parsing failure, you have two options:
- retry the request, which may take 30+ seconds (if your LLM outputs are really long and you're using something like GPT-4)
- fix the parsing issue
In our library we do the latter. The conversion from BAML types to Pydantic ones is a compile-time step unrelated to the problem above. That doesn't happen at runtime.
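To make the second option concrete, here is a minimal sketch of repairing a missing comma locally instead of re-sending the request. It is not the library's actual parser: `parse_leniently` and the regex are hypothetical, and they only handle the case where a comma is missing at the end of a line.

```python
import json
import re

# Hypothetical repair rule (an assumption for illustration, not BAML's parser):
# a line that ends with a complete value and is followed by a line that starts
# a new value or key is treated as missing its trailing comma.
_MISSING_COMMA = re.compile(r'(["\d\]}]|true|false|null)[ \t]*\n(?=\s*["\[{\d\-])')


def parse_leniently(raw: str):
    """Parse strictly first; on failure, patch missing commas and parse again."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        repaired = _MISSING_COMMA.sub(r"\1,\n", raw)
        # Still raises JSONDecodeError if the damage is something else.
        return json.loads(repaired)
```

Calling `parse_leniently` on an object whose first line lacks its trailing comma returns the parsed dict without a second round trip to the model, which is the point of option two: the fix costs microseconds rather than another full generation.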
tb = TypeBuilder()
tb.Person.add_property("last_name", tb.string().list())
tb.Person.add_property("height", tb.float().optional()).description(
    "Height in meters"
)
tb.Hobby.add_value("chess")
for name, val in tb.Hobby.list_values():
    val.alias(name.lower())
tb.Person.add_property("hobbies", tb.Hobby.type().list()).description(
    "Some suggested hobbies they might be good at"
)
# no_tb_res = await b.ExtractPeople("My name is Harrison. My hair is black and I'm 6 feet tall.")
tb_res = await b.ExtractPeople(
    "My name is Harrison. My hair is black and I'm 6 feet tall. I'm pretty good around the hoop.",
    {"tb": tb},
)
assert len(tb_res) > 0, "Expected non-empty result but got empty."
for r in tb_res:
    print(r.model_dump())
Neat, thanks! I'm still pondering whether I should be using this, since most of the retries I have to do are because the LLM itself doesn't understand the schema asked for (e.g. output with missing fields, or a value not present in `Literal[]`). Certain models are especially bad with deeply nested schemas and output gibberish. Anything on your end that can help with that?
Or, if you're open to sharing your prompt / data model, I can send over my best guess of a good prompt! We've found these models work decently well even with 50+ fields, nesting and whatnot!
I might share it with you later on your Discord server.
> I can send over my best guess of a good prompt!
Now if you could automate the above process by "fitting" a first-draft prompt to a wanted schema, i.e. where your library makes a few adjustments when some assertions do not pass by having a chat of its own with the LLM, that would be super useful! Heck, I might just implement it myself.
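A rough sketch of what such a fitting loop could look like, under stated assumptions: `fit_prompt`, `run_llm`, `revise` and the check names are all hypothetical and are not part of BAML. The two callables are stubbed here so the loop runs without a model; in practice `revise` would be the "chat of its own with the LLM".

```python
from typing import Callable


def fit_prompt(
    draft: str,
    sample_input: str,
    run_llm: Callable[[str, str], dict],        # hypothetical: (prompt, input) -> parsed output
    checks: dict[str, Callable[[dict], bool]],  # assertion name -> predicate on the output
    revise: Callable[[str, list[str]], str],    # hypothetical: (prompt, failed checks) -> new prompt
    max_revisions: int = 3,
) -> tuple[str, list[str]]:
    """Revise the prompt until every check passes or the revision budget is spent.

    Returns the last prompt that was actually tested and the checks it failed.
    """
    prompt = draft
    revisions = 0
    while True:
        output = run_llm(prompt, sample_input)
        failed = [name for name, passes in checks.items() if not passes(output)]
        if not failed or revisions >= max_revisions:
            return prompt, failed
        prompt = revise(prompt, failed)
        revisions += 1


# Stand-in stubs (assumptions): a fake model that only respects the enum
# once the prompt spells out the allowed values.
def fake_llm(prompt: str, text: str) -> dict:
    return {"hobby": "chess" if "Allowed hobbies" in prompt else "hoops"}


def fake_revise(prompt: str, failed: list[str]) -> str:
    return prompt + "\nAllowed hobbies: chess, basketball."


final_prompt, remaining = fit_prompt(
    "Extract the person's hobby.",
    "I'm pretty good around the hoop.",
    fake_llm,
    {"hobby_in_enum": lambda out: out.get("hobby") in {"chess", "basketball"}},
    fake_revise,
)
```

Keeping the checks as named predicates means the list of failed names can be passed straight to the revising step, so the revision request can say which part of the schema the model got wrong.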