| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by smallnamespace 763 days ago
	This might be a dumb question, but did you ever try having it introspect into its own execution log, or perhaps a summary of its log? I also have a tendency to get side tracked and the only remedy was to force myself to occasionally pause what I'm doing and then reflect, usually during a long walk.

1 comments

swax 763 days ago

Yea, there's some logs here https://test.naisys.org/logs/

Inter-agent tasks is a fun one. Sometimes it works out, but a lot of the time they just end up going back and forth talking, expanding the scope endlessly, scheduling 'meetings' that will never happen, etc..

A lot of AI 'agent systems' right now add a ton of scaffolding to corral the AI towards success. The scaffolding is inversely proportional to the sophistication of the model. GPT-3 needs a ton, Opus needs a lot less.

Real autonomous AI you should just be able to give a command prompt and a task and it can do the rest. Managing it's own notes, tasks, goals, reports, etc.. Just like if any of us were given a command shell and task to complete.

Personally I think it's just a matter of the right training. I'm not sure if any of these AI benchmarks focus on autonomy, but if they did maybe the models would be better at autonomous tasks.

link

khimaros 762 days ago

> Inter-agent tasks is a fun one. Sometimes it works out, but a lot of the time they just end up going back and forth talking, expanding the scope endlessly, scheduling 'meetings' that will never happen, etc..

sounds like "a straight shooter with upper management written all over it"

link

swax 762 days ago

Sometimes I'll tell two agents very explicitly to share the work, "you work on this, the other should work on that." And one of the agents ends up delegating all their work to the other, constantly asking for updates, coming up with more dumb ideas to pile on to the other agent who doesn't have time to do anything productive given the flood of requests.

What we should do is train AI on self-help books like the '7 habits of highly productive people'. Let's see how many paperclips we get out of that.

link

nerdponx 762 days ago

I suspect it's a matter of context: one or both agents forget that they're supposed to be delegating. ChatGPT's "memory" system for example is a workaround, but even then it loses track of details in long chats.

link

swax 762 days ago

Opus seems to be much better at that. Probably why it’s so much more expensive. AI companies have to balance costs. I wonder if the public has even seen the most powerful, full fidelity models, or if they are too expensive to run.

link

nerdponx 761 days ago

Right, but this is also a core limitation in the transformer architecture. You only have very short-term memory (context) and very long-term memory (fixed parameters). Real minds have a lot more flexibility in how they store and connect pieces of information. I suspect that further progress towards something AGI-like might require more "layers" of knowledge than just those two.

When I read a book, for example, I do not keep all of it in my short-term working memory, but I also don't entirely forget what I read at the beginning by the time I get to the end: it's something in between. More layered forms of memory would probably allow us to return to smaller context windows.

link