by pcwelder 361 days ago
```
try:
    answer = chain.invoke(question)
    # print(answer) # raw JSON output
    display_answer(answer)
except Exception as e:
    print(f"An error occurred: {e}")
    chain_no_parser = prompt | llm
    raw_output = chain_no_parser.invoke(question)
    print(f"Raw output:\n\n{raw_output}")
```

Wait, are you calling the LLM again when parsing fails, just to get what the LLM already sent you?

The whole thing is not difficult to do if you call the API directly without LangChain. That would also help you avoid this kind of inefficiency.
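To make the point concrete, here is a minimal sketch of the single-call approach: keep the raw text from the one request, try to parse it, and fall back to printing that same text on failure. `call_llm` is a hypothetical stub standing in for whatever direct API call you use (it is not a real client), and it returns malformed JSON on purpose so the fallback path runs.

```python
import json

calls = 0  # counts how many times the (stubbed) model is invoked


def call_llm(question: str) -> str:
    """Hypothetical stand-in for a direct API call; not a real client."""
    global calls
    calls += 1
    return '{"answer": "42"'  # deliberately malformed: missing closing brace


def ask(question: str):
    raw = call_llm(question)  # one request; hold on to the raw text
    try:
        return json.loads(raw), raw
    except json.JSONDecodeError as e:
        # Reuse the text we already have instead of invoking the model again.
        print(f"An error occurred: {e}")
        print(f"Raw output:\n\n{raw}")
        return None, raw


parsed, raw = ask("What is the answer?")
```

The model is invoked once whether or not parsing succeeds, so a parse failure costs nothing extra.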


I don't get the langchain hate, but I agree that this "blog post" is bad.

LangChain has a way to return the raw output alongside the parsed result when using `with_structured_output`: https://python.langchain.com/docs/how_to/structured_output/#...

It's pretty common to use a cheaper model to fix these errors to match the schema if it fails with a tool call.

> It's pretty common to use a cheaper model to fix these errors to match the schema if it fails with a tool call.

This has not been true for a while.

For open models there's zero need for these kinds of hacks. Libraries like XGrammar and Outlines (and several others) exist both as standalone solutions and as components used by a wide range of open source tools to ensure structured generation happens at the logit level. There's no need to multiply your inference cost when in some cases (XGrammar) these libraries can actually reduce it.

For proprietary models, more and more providers are using proper structured generation (i.e. constrained decoding) under the hood. Most notably, OpenAI's current version of Structured Outputs uses logit-based methods to guarantee the structure of the output.
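For anyone unfamiliar with what "at the logit level" means, here is a toy sketch of the idea, not the XGrammar or Outlines API. The vocabulary, the `GRAMMAR` state machine, and `fake_logits` are all made up for illustration: at each step, tokens the grammar forbids are masked to negative infinity before picking the next token, so the output cannot violate the structure even when the "model" prefers an invalid token.

```python
import json
import math

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"answer"', ":", '"yes"', '"no"', "hello"]

# Hypothetical grammar as a state machine: state -> {allowed token: next state}.
GRAMMAR = {
    0: {"{": 1},
    1: {'"answer"': 2},
    2: {":": 3},
    3: {'"yes"': 4, '"no"': 4},
    4: {"}": 5},
}
FINAL_STATE = 5


def fake_logits() -> list:
    """Stand-in for a model: scores rise with index, so "hello" always wins."""
    return [float(i) for i in range(len(VOCAB))]


def constrained_decode() -> str:
    state, out = 0, []
    while state != FINAL_STATE:
        allowed = GRAMMAR[state]
        # Mask every token the grammar does not allow in this state.
        masked = [
            score if tok in allowed else -math.inf
            for tok, score in zip(VOCAB, fake_logits())
        ]
        # Greedy pick among the tokens that survived the mask.
        token = VOCAB[max(range(len(VOCAB)), key=masked.__getitem__)]
        out.append(token)
        state = allowed[token]
    return "".join(out)


result = constrained_decode()
parsed = json.loads(result)  # always parses: invalid tokens were never selectable
```

Unconstrained, this fake model would emit "hello" forever. With the mask, each generated token is valid by construction, so there is nothing left to repair with a second model call.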