| For me, a very simple "break down tasks into a queue and store them in a DB" solution has helped tremendously with most requests. Instead of trying to do everything in a single chat or chain, add steps that ask the LLM to break down the next tasks, with context, and store that in SQLite or something. Then start new chats/chains on each of those tasks. Then just loop them back into the LLM. I find that long chats or chains just confuse most models and we start seeing gibberish. Right now I'm favoring something like: "We're going to do task {task}. The current situation and context is {context}. Break down what individual steps we need to perform to achieve {goal} and output these steps with their necessary context as {standard_task_json}. If the output is already enough to satisfy {goal}, just output the result as text." I find that leaving everything to the LLM in a sequence is not as effective as using the LLM to break things down and having a DB and code logic to support the development of more complex outcomes. |
Also, explicitly mentioning what to "forget" or stop focusing on seems to remove some noise from the responses when they are large.
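
To make the queue-and-DB loop described above concrete, here is a minimal Python sketch using the standard-library `sqlite3` and `json` modules. Everything named here is an assumption for illustration: the `tasks` table layout, the helper names (`init_db`, `enqueue`, `parse_subtasks`, `run_queue`), and the JSON shape substituted for `{standard_task_json}` (a list of `{"task": ..., "context": ...}` objects). The `llm` argument is a hypothetical callable that takes a prompt string and returns the model's reply as a string; plug in whatever client you actually use.

```python
import json
import sqlite3
from typing import Callable, List, Optional

# Prompt adapted from the comment above; the JSON shape is an assumed
# stand-in for {standard_task_json}.
PROMPT = (
    "We're going to do task {task}. The current situation and context is {context}. "
    "Break down what individual steps we need to perform to achieve {goal} and output "
    "these steps with their necessary context as a JSON list of "
    '{{"task": ..., "context": ...}} objects. If the output is already enough to '
    "satisfy {goal}, just output the result as text."
)


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) task queue table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, task TEXT, context TEXT, "
        "status TEXT DEFAULT 'pending', result TEXT)"
    )
    return conn


def enqueue(conn: sqlite3.Connection, task: str, context: str) -> None:
    conn.execute("INSERT INTO tasks (task, context) VALUES (?, ?)", (task, context))
    conn.commit()


def parse_subtasks(reply: str) -> Optional[List[dict]]:
    """Return a list of subtask dicts, or None if the reply is a plain-text result."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if isinstance(data, list) and all(isinstance(d, dict) and "task" in d for d in data):
        return data
    return None


def run_queue(conn: sqlite3.Connection, llm: Callable[[str], str],
              goal: str, max_steps: int = 50) -> List[str]:
    """Process pending tasks one at a time, each in a fresh LLM call."""
    results = []
    for _ in range(max_steps):  # hard cap so a model that keeps splitting cannot loop forever
        row = conn.execute(
            "SELECT id, task, context FROM tasks "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            break
        task_id, task, context = row
        reply = llm(PROMPT.format(task=task, context=context, goal=goal))
        subtasks = parse_subtasks(reply)
        if subtasks is None:
            # Plain text: treat it as the finished result for this task.
            conn.execute("UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
                         (reply, task_id))
            results.append(reply)
        else:
            # JSON list: mark the parent as split and queue each step as a new task.
            conn.execute("UPDATE tasks SET status = 'split' WHERE id = ?", (task_id,))
            for sub in subtasks:
                conn.execute("INSERT INTO tasks (task, context) VALUES (?, ?)",
                             (str(sub["task"]), str(sub.get("context", ""))))
        conn.commit()
    return results
```

Usage would look something like `conn = init_db("tasks.db"); enqueue(conn, "write report", "no notes yet"); run_queue(conn, my_llm, goal="a finished report")`, where `my_llm` is your own wrapper. Because each task carries only its own stored context, every call starts from a short prompt instead of a long chat history, and the `max_steps` cap is one simple guard against runaway decomposition.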