| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by b2v 728 days ago
	I'm not sure I understand, in the docs for the python client it says that BAML types get converted to Pydantic models, doesn't that step include the extra latency you mentioned?

1 comments

aaronvg 728 days ago

My bad, I think I didnt explain correctly. Basically you have two options when a "," is missing (amongst other issues) in an LLM output which causes a parsing issue:

- retry the request, which may take 30+ secs (if your LLM outputs are really long and you're using something like gpt4)

- fix the parsing issue

In our library we do the latter. The conversion from BAML types to Pydantic ones is a compile-time step unrelated to the problem above. That doesn't happen at runtime.

link

b2v 728 days ago

Thanks for the clarification. How do you handle dynamic types, ie types determined at runtime?

link

hellovai 728 days ago

we recently added dynamic type support with this snippet! (docs coming soon!)

Python: https://github.com/BoundaryML/baml/blob/413fdf12a0c8c1ebb75c...

Typescript: https://github.com/BoundaryML/baml/blob/413fdf12a0c8c1ebb75c...

Snippet:

async def test_dynamic():

    tb = TypeBuilder()

    tb.Person.add_property("last_name", tb.string().list())

    tb.Person.add_property("height", tb.float().optional()).description(
        "Height in meters"
    )


    tb.Hobby.add_value("chess")

    for name, val in tb.Hobby.list_values():
        val.alias(name.lower())

    tb.Person.add_property("hobbies", tb.Hobby.type().list()).description(
        "Some suggested hobbies they might be good at"
    )

    # no_tb_res = await b.ExtractPeople("My name is Harrison. My hair is black and I'm 6 feet tall.")
    tb_res = await b.ExtractPeople(
        "My name is Harrison. My hair is black and I'm 6 feet tall. I'm pretty good around the hoop.",
        {"tb": tb},
    )

    assert len(tb_res) > 0, "Expected non-empty result but got empty."

    for r in tb_res:
        print(r.model_dump())

link

b2v 728 days ago

Neat, thanks! I'm still pondering wether I should be using this since most of the retries I have to do are because of the LLM itself not understanding the schema asked for (eg output with missing fields / using a value not present in `Literal[]`) — certain models being especially bad with deeply nested schemas and output gibberish. Anything on your end that can help with that?

link

hellovai 728 days ago

nothing specific, but you can try our prompt / datamodel out on https://www.promptfiddle.com

or if you're open to share your prompt / data model with, I can send over my best guess of a good prompt! We've found these models works even with over 50+ fields / nested and whatnot decently well!

link

b2v 728 days ago

I might share it with you later on your discord server.

> I can send over my best guess of a good prompt!

Now if you could automate the above process by "fitting" a first draft prompt to a wanted schema, ie where your library makes a few adjustments if some assertions do not pass by have having a chat of its own with the LLM, that would be super useful! Heck i might just implement it myself.

link