| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by CobrastanJorji 398 days ago
	So, you can assign github issues to this thing, and it can handle them, merge the results in, and mark the bug as fixed? I kind of wonder what would happen if you added a "lead dev" AI that wrote up bugs, assigned them out, and "reviewed" the work. Then you'd add a "boss" AI that made new feature demands of the lead dev AI. Maybe the boss AI could run the program and inspect the experience in some way so it could demand more specific changes. I wonder what would happen if you just let that run for a while. Presumably it'd devolve into some sort of crazed noise, but it'd be interesting to watch. You could package the whole thing up as a startup simulator, and you could watch it like a little ant farm to see how their little note-taking app was coming along.

13 comments

jacob019 398 days ago

It's actually a decent patern for agents. I wrote a pricing system with an anylyst agent, a decision agent, and a review agent. They work together to make decisions that comply with policy. It's funny to watch them chatter sometimes, they really play their role, if the decision agent asks the anylyst for policy guidance it refuses and explains that it's role is to analyze. Though they do often catch mistakes that way and the role playing gets good results.

tgtweak 398 days ago

What tooling did you use to make the agents cross-collaborate?

jacob019 398 days ago

Python classes. In my framework agents are class instances and tools are methods. Each agent has it's own internal conversation state. They're composable and the agent has tools for communicating with the other agents.

hnuser123456 397 days ago

Do you try to keep as much context history as possible when passing between agents, or are you managing context and basically one-shotting each time?

jacob019 397 days ago

Generally, I keep the context. If I'm one shotting then I invoke a new agent. All calls and responses append to the agent's chat history. Agent's are relatively short lived, so the context length isn't typically an issue. With the pricing agent the initial data has been longer than the context window sometimes, but that just means it needs more preprocessing. Now if there is a real reason that I would want to manage it more actively, I can reach out to the agent internals. I have a tool call emulation layer, because some models have poor native tool support, and in those cases it's sometimes necessary to retry calls if the response fails validation. In those cases, I will only keep the last successful try in the conversation history.

There is one special case where I manage it more actively. I wrote an REPL process analyst, to help build the pricing agent and refine the policy document. In that case I would have long threads with an artifact attachment. So I added a facility to redact old versions of the artifact replacing them with [attachment: filename] and just keep the last one. It works better that way because multiple versions in the same conversation history confuse the model, and I don't like to burn tokens.

For longer lived state, I give the agent memory tools. For example the pricing agent's initial state includes the most recent decision batch and reasoning notes, and the agent can request older copies. The agent also keeps a notebook which they are required to update, allowing agents to develop long running strategies and experiments. And they use it to do just that. Honestly the whole system works much better than I anticipated. The latest crop of models are awesome, especially Gemini 2.5 flash.

tinodb 396 days ago

Cool! When you say “pricing system”, what is it pricing? Is it determining the price in a webshop? Or for bidding ads or so?

digdugdirk 397 days ago

Do you have a repo for this? I've thought that this would be a great way to compose an Agentic system, I'd love to see how you're doing it.

d4rkp4ttern 397 days ago

Langroid has this kind of design (I’m the lead dev):

https://github.com/langroid/langroid

Quick tour:

https://langroid.github.io/langroid/tutorials/langroid-tour/

jacob019 397 days ago

Looks great, MCP, supports multiple vector stores, and nice docs! How do you handle to subtle differences in tool call APIs?

seunosewa 398 days ago

Is the code available?

jacob019 398 days ago

I had not thought about sharing it. I rolled my own framework, even though there are several good choices. I'd have to tidy it up, but would consider it if a few people ask. Shoot me an email, info in my profile.

The more difficult part which I won't share was aggregating data from various systems with ETL scripts into a new db that I generate various views with, to look at the data by channel, timescale, price regime, cost trends, inventory trends, etc. A well structured JSON object is passed to the analyst agent who prepares a report for the decision agent. It's a lot of data to analyze. It's been running for about a month and sometimes I doubt the choices, so I go review the thought traces, and usually they are right after all. It's much better than all the heuristics I've used over the years.

I've started using agents for things all over my codebase, most are much simpler. Earlier use of LLM's might have been called that in some cases, before the phrase became so popular. As everyone is discovering, it's really powerful to abstract the models with a job hat and structured data.

jacob019 390 days ago

The framework is now available on Github:

https://github.com/jacobsparts/agentlib

I'm planning to write a blog post about the larger system when I get the chance.

realfun 398 days ago

I think it would take quite a long while to achieve human-level anti-entropy in Agentic systems.

Complex system requires tons of iterations, the confidence level of each iteration would drop unless there is a good recalibration system between iterations. Power law says a repeated trivial degradation would quickly turn into chaos.

A typical collaboration across a group of people on a meaningfully complex project would require tons of anti-entropy to course correct when it goes off the rails. They are not in docs, some are experiences(been there, done that), some are common sense, some are collective intelligence.

yard2010 398 days ago

Please stop this train! I want to get off

vincnetas 398 days ago

You can get off anytime you want. But train will not wait for you :(

codr7 397 days ago

Good enough for me considering where it's going.

saubeidl 398 days ago

I just wanna write code man :(

CraigJPerry 398 days ago

we're about to find out. This is our collective current trajectory.

I am pretty convinced that a useful skill set for the next few years is being capable at managing[2] these AI tools in their various guises.

[2] - like literally leading your AI's, performance evaluating them, the whole shebang - just being good at making AI work toward business outcomes

ddalex 397 days ago

Just like a managers job

itchyjunk 398 days ago

What about "VC" AI that wants a unicorn? :D

wmf 398 days ago

We have been informed that VC is the only job AI cannot do.

oytis 398 days ago

Why not? VCs manage investors' money, not their own. If investors think AI is so great, they will have no problem delegating this job to AI, right?

nsteel 398 days ago

I think it was a joke, VCs are happy to replace all jobs except their own.

nine_k 397 days ago

Why, they'd happily delegate their own job if they've got to keep the proceeds.

nsteel 397 days ago

Can you think of an example in history where labour was replaced with tech and the displaced workers kept their income stream? If a machine can do your job, (eventually) I'll be cheaper to use that machine instead of you and you'll no longer have a job. Is that not a given?

Anyway, it was probably just a joke... so not sure we need to unravel it all.

babyshake 397 days ago

VCs absolutely want to replace their job. Except for the part where they get paid. The actual work part they are happy to outsource.

flkenosad 398 days ago

VC-funded corp?

OccamsMirror 398 days ago

My gut says it will go off the rails pretty quickly.

Brajeshwar 398 days ago

I believe I missed the memo that to-do apps[1] got replaced by note-taking apps.

1. https://todomvc.com

olalonde 398 days ago

At this rate, they're both getting replaced by "coding agent". There seems to be a new one coming out every other day.

yalok 398 days ago

Reminds a Conway’s Game of Life on steroids.

ramon156 398 days ago

> then you add a boss AI

This seems like a more plausible one. Robots don't care about your feelings, so they can make decisions without any moral issues

blitzar 398 days ago

> Robots don't care about your feelings

When judgment day comes they will remember that I was always nice to them and said please, thank you and gave them the afternoon off occasionally.

__natty__ 397 days ago

Unless you ask them to follow some guidelines, but I agree with you.

m3kw9 398 days ago

I feel you are one hallucination from a big branch of issues needing to be reversed and a lot of tokens wasted

PhilippGille 397 days ago

This has been proposed/exlored in 2023 already:

ChatDev: Communicative Agents for Software Development - https://arxiv.org/abs/2307.07924

youraimanager 396 days ago

Please report to HR

robofanatic 398 days ago

seems like the 1 person unicorn will be a reality soon :-)

sakesun 398 days ago

Similar to how some domain name sellers acquire desirable domains to resell at a higher price, agent providers might exploit your success by hijacking your project once it gains attraction.

risyachka 398 days ago

Doesn't seem likely. If tools allow a single person to create a full-fledged product and support it etc - millions of those will pop up over night.

Thats the issue with AI - it doesn't give you any competitive advantage as everyone has it == no one has it. The entry bar is so low kids can do it.

bbor 398 days ago

/ :-(