by snowcrash123 1101 days ago
I agree with the excerpt on agents. Reliability and reproducibility of task completion are the biggest problems agents face in crossing the chasm to real-life use cases. When agents are given an objective, they reason about the next best action from first principles, from scratch, and the agent trajectory ends up becoming more of a linguistic dance. We are solving some of these agent-specific problems at SuperAGI https://github.com/TransformerOptimus/SuperAGI (disclaimer: I'm the creator) by doing agent trajectory fine-tuning using recursive instructions. Think of the objective as telling the agent to go from A to B; the instructions are akin to giving it directions for the route. These instructions can be self-created after every run and fed into subsequent runs to improve the trajectory.
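
A minimal sketch of that feedback loop, to make the idea concrete. This is not SuperAGI's actual code: `run_with_recursive_instructions`, the `RunFn` signature, and `fake_agent` are all hypothetical names, and the "agent" is a stand-in for an LLM-backed run.

```python
from typing import Callable, List, Tuple

# Assumed (hypothetical) contract: a run takes the objective plus the
# instructions gathered so far and returns (result, instruction_learned).
RunFn = Callable[[str, List[str]], Tuple[str, str]]


def run_with_recursive_instructions(
    objective: str, run_agent: RunFn, n_runs: int
) -> List[str]:
    """Run the agent n_runs times, feeding each run's learned
    instruction into every subsequent run."""
    instructions: List[str] = []
    for _ in range(n_runs):
        _result, learned = run_agent(objective, instructions)
        # Keep only new, non-empty instructions so the list does not bloat.
        if learned and learned not in instructions:
            instructions.append(learned)
    return instructions


def fake_agent(objective: str, instructions: List[str]) -> Tuple[str, str]:
    # Stand-in for a real agent: "learns" one more route hint per run.
    hint = f"step {len(instructions) + 1} toward {objective}"
    return "done", hint
```

Calling `run_with_recursive_instructions("B", fake_agent, 3)` accumulates three route hints, each produced by a run that had already seen the earlier ones.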

The other problem with agents is that most independent agents can handle only a very thin slice of a use case, and for complex knowledge work tasks, more often than not, one agent is not enough. You need a team of agents. We introduced the concept of Agent Clusters, which operate in a master-slave architecture and coordinate among themselves via shared memory and a shared task list to complete nuanced tasks.
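
Roughly, the coordination pattern could look like the sketch below. It is an illustration under assumptions, not the Agent Clusters implementation: `run_cluster` is a hypothetical name, workers are plain functions, and the master simply hands out subtasks round-robin.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

# Hypothetical worker contract: (subtask, shared_memory) -> result text.
Worker = Callable[[str, Dict[str, str]], str]


def run_cluster(subtasks: List[str], workers: Dict[str, Worker]) -> Dict[str, str]:
    """Master loop: pop subtasks from a shared task list, assign them
    round-robin, and store each result in shared memory."""
    task_list: Deque[str] = deque(subtasks)  # shared task list
    memory: Dict[str, str] = {}              # shared memory, visible to all workers
    names = list(workers)
    turn = 0
    while task_list:
        task = task_list.popleft()
        name = names[turn % len(names)]
        # Each worker can read what earlier workers wrote to memory.
        memory[task] = workers[name](task, memory)
        turn += 1
    return memory
```

A real scheduler would pick workers by capability and let them push follow-up tasks back onto the list; round-robin just keeps the sketch short.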

Another big bottleneck, I think, is the lack of a notion of Knowledge for agents. We have LTM and STM, but knowledge is a specialized understanding of a particular class of objectives (ecommerce customer support, account-based marketing, medical diagnostics for a particular condition, etc.) plugged into the agent. Currently agents rely on the knowledge available in the LLMs. LLMs are great for intelligence, but they do not necessarily have the knowledge required for an objective. So we added a concept of knowledge, which is an embedding plugged into the agent apart from LTM / STM.
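
As a sketch of what "an embedding plugged into the agent" can mean in practice: domain snippets are stored with their vectors, and the closest ones to the current query are pulled into the prompt. The names `cosine` and `top_knowledge` and the toy 2-d vectors are assumptions for illustration; a real setup would use an embedding model and a vector store.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_knowledge(
    query_vec: Sequence[float],
    knowledge: List[Tuple[str, Sequence[float]]],
    k: int = 1,
) -> List[str]:
    """Return the k knowledge snippets most similar to the query vector."""
    ranked = sorted(
        knowledge, key=lambda item: cosine(query_vec, item[1]), reverse=True
    )
    return [text for text, _vec in ranked[:k]]
```

The retrieved snippets would then be added to the agent's context alongside whatever LTM / STM already supplies.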

There are a lot of other challenges that need to be solved, like agent performance monitoring, agent-specific models, agent-to-agent communication, etc., to truly solve for agents deployed in production. I'm not sure about the point mentioned in the article that they might even take over the entire stack, because autonomous agentic behaviour is good for certain use cases and not for all kinds of apps.