Hacker News
by threeseed 664 days ago
I still don't understand how multi-step LLM based AI agents work.

If the probability of an LLM making a mistake at each step is 5% and you have 10 steps, then the accuracy of the overall workflow is about 60% (0.95^10 ≈ 0.60), which is useless. Even if we get major advances in LLM performance and the per-step error rate drops to 1%, the overall workflow is still only about 90% accurate (0.99^10 ≈ 0.90), which is poor.
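For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes errors at each step are independent and that a single mistake anywhere ruins the whole run. The function name `workflow_accuracy` is made up for illustration.

```python
def workflow_accuracy(step_error_rate: float, steps: int) -> float:
    """Probability that every step succeeds.

    Assumes independent errors and that any single error
    invalidates the whole workflow (the worst-case reading).
    """
    return (1.0 - step_error_rate) ** steps


if __name__ == "__main__":
    for rate in (0.05, 0.01):
        acc = workflow_accuracy(rate, 10)
        print(f"{rate:.0%} error per step, 10 steps -> {acc:.1%} overall")
    # 5% error per step, 10 steps -> 59.9% overall
    # 1% error per step, 10 steps -> 90.4% overall
```

If errors are correlated, or if some mistakes are recoverable, the real number will differ from this estimate.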

So what is the plan here? There is a limit to how many business tasks can tolerate that much inaccuracy.

1 comment

Let's say the task is automatically aggregating customer support info.

First step is collecting incoming emails

Second step is summarizing each one

Third step is batching by issue/severity

Notice how there is tolerance for deviation and error at each step. "An error" here looks like coding a ticket red instead of yellow, or slightly misrepresenting what a client said. It degrades the output rather than invalidating it, so the overall workflow can still be net positive.
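A rough sketch of the difference between "any error ruins the run" and "errors only cost part of the value" is below. Every number in it is hypothetical and chosen purely for illustration: the 5% per-step error rate is borrowed from the parent post, and the `PENALTIES` values (the fraction of a ticket's usefulness lost when a step goes wrong) are invented assumptions, not measurements.

```python
# All numbers below are hypothetical, for illustration only.
STEP_ERROR_RATE = 0.05

# Assumed fraction of a ticket's usefulness lost when a step goes wrong.
PENALTIES = {
    "collect": 0.5,    # e.g. part of an email thread is missed
    "summarize": 0.3,  # e.g. a client's point is slightly misrepresented
    "batch": 0.2,      # e.g. a ticket is coded red instead of yellow
}


def strict_success(error_rate: float, n_steps: int) -> float:
    """Share of tickets with zero errors, assuming independent steps."""
    return (1.0 - error_rate) ** n_steps


def expected_value_retained(error_rate: float, penalties: dict) -> float:
    """Average usefulness kept per ticket when losses are additive.

    By linearity of expectation, each step costs
    error_rate * penalty on average.
    """
    return 1.0 - sum(error_rate * p for p in penalties.values())


if __name__ == "__main__":
    strict = strict_success(STEP_ERROR_RATE, len(PENALTIES))
    graded = expected_value_retained(STEP_ERROR_RATE, PENALTIES)
    print(f"tickets with no error at all: {strict:.1%}")
    print(f"average usefulness retained:  {graded:.1%}")
    # tickets with no error at all: 85.7%
    # average usefulness retained:  95.0%
```

Under these made-up penalties, roughly one ticket in seven contains some error, yet most of the value survives. Whether that is good enough depends on what a wrong ticket actually costs the business.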