by nostrebored
11 days ago
But why does your agent control doneness? It seems to me the oddest part to delegate. All LLMs are terrible at it. Most LLM tasks can be expressed as a DAG or a DAG of DAGs. Why delegate that to a random point in context instead of enforcing the flow?
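To make "enforcing the flow" concrete, here is a minimal sketch, assuming each step is just a function. The names `run_dag`, `steps`, and `deps` are made up for illustration, and the lambdas are stand-ins for real LLM calls. The point is only that the runner, not the model, decides when the task is finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, Set

# Hypothetical step type: receives the outputs of earlier steps, returns its own.
Step = Callable[[Dict[str, str]], str]


def run_dag(steps: Dict[str, Step], deps: Dict[str, Set[str]]) -> Dict[str, str]:
    """Run every step exactly once, in dependency order."""
    results: Dict[str, str] = {}
    # deps maps each step name to the set of steps that must run before it.
    for name in TopologicalSorter(deps).static_order():
        results[name] = steps[name](results)
    # Done means the graph is exhausted, not that a model declared itself done.
    return results


# Stand-ins for LLM calls (assumptions, not a real API).
steps = {
    "plan": lambda r: "form with one field",
    "code": lambda r: f"<form> for {r['plan']}",
    "review": lambda r: f"reviewed: {r['code']}",
}
deps = {"plan": set(), "code": {"plan"}, "review": {"code"}}

results = run_dag(steps, deps)
```

A "DAG of DAGs" would fall out of the same shape if a step itself called `run_dag` on a smaller graph.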
And it gets delegated to context because it's easier to have another session and tell it to double-check and critique the first LLM than it is to write a deterministic test for every prompt. Like if I want a new form that sends a REST request on submit, I can have two LLMs duking it out in 5 minutes. If I have to write Selenium tests then I might as well just write the feature. Or I can have an LLM write the tests, but that's more or less the same as letting a second LLM judge the first.
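For what it's worth, the "two LLMs duking it out" setup is roughly the loop below. This is a sketch under assumptions: `generate_with_critic`, `fake_generate`, and `fake_critique` are invented names, and the fakes stand in for two separate model sessions. The only deterministic part is the round cap.

```python
from typing import Callable, Optional, Tuple


def generate_with_critic(
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Alternate generator and critic until the critic has no complaint
    or the round budget runs out. Returns (last draft, accepted?)."""
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        feedback = critique(task, draft)
        if feedback is None:  # critic found nothing to object to
            return draft, True
    return draft, False  # budget exhausted; the critic never signed off


# Stand-ins for the two sessions (assumptions, not real model calls).
def fake_generate(task: str, feedback: Optional[str]) -> str:
    return "<form>" if feedback is None else "<form> + POST on submit"


def fake_critique(task: str, draft: str) -> Optional[str]:
    return None if "POST" in draft else "missing POST on submit"


draft, accepted = generate_with_critic(
    fake_generate, fake_critique, "form that sends a REST request on submit"
)
```

Note that "accepted" here still only means the second model stopped complaining, which is the judgment call the parent comment is objecting to.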