Hacker News
by IanCal 1190 days ago
I'm increasingly convinced you can build an AGI system with GPT-4.

People are trying to get it to solve everything up front, but I've had GPT-3 do much better by taking it through a problem and asking it questions. Then I realised it was good at asking those questions too, so I just hooked it up to talk to itself with different roles. GPT-4 seems much better overall and is very good at using tools if you just tell it how and what it has available.

With a better setup than ReAct and better memory storage and recall, I think it'd be an AGI. I'm not hugely convinced it isn't anyway - it's better than most people at most tasks I've thrown at it.

Oh, and GPT came up with better roles for the "voices in the head" than I did too.
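
A minimal sketch of the "talk to itself with different roles" setup described above, not the poster's actual code. The `llm` callable (prompt in, reply out) is a hypothetical stand-in for whatever model API you use, and `echo_llm` is a dummy so the loop can be dry-run without a model.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: takes a prompt string, returns the model's reply.
LLM = Callable[[str], str]


def role_dialogue(llm: LLM, problem: str, roles: List[str], turns: int) -> List[Tuple[str, str]]:
    """Let differently prompted roles take turns on one shared transcript."""
    transcript: List[Tuple[str, str]] = []
    for turn in range(turns):
        # Roles speak round-robin in the order given.
        role = roles[turn % len(roles)]
        history = "\n".join(f"{who}: {text}" for who, text in transcript)
        prompt = (
            f"You are the {role}.\n"
            f"Problem: {problem}\n"
            f"Conversation so far:\n{history}\n"
            f"{role}:"
        )
        transcript.append((role, llm(prompt).strip()))
    return transcript


def echo_llm(prompt: str) -> str:
    """Dummy model for a dry run: echoes the first line of the prompt."""
    return f"(reply to: {prompt.splitlines()[0]})"
```

Something like `role_dialogue(echo_llm, "design a cache", ["questioner", "solver"], turns=4)` gives an alternating transcript. Letting the model propose the role names itself would be one extra call before the loop.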

2 comments

I agree. There is something special about layering these guys. To me this is like we are looking at a static combustion engine without the vehicle. “How is this useful?”

It’s just that I’m not sure what the best approach is here. I’m waiting for other, smarter folks to put the pieces together.

I'm taking the liberty to spread my most recent words of visionary wisdom here. (/s)

One of my main issues with these guys is their context window, their memory. It's hard to see an LLM working on a codebase a few thousand tokens at a time and still being precise about it. To do that you need summary techniques: feeding the prompt incrementally compressed summaries and hoping it maintains cohesion.

That sounds a lot like trying to let the CEO of a company do all the grunt work by feeding him summaries. "Mr Gates, here's a 2 paragraph summary of our codebase. Should we name the class AnalogyWidgetProducer or FactoryWidgetAnalogyReporter?"

I don't think that's going to work.
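
For reference, the incremental-summary approach being doubted here looks roughly like the sketch below. The `llm` callable is a hypothetical prompt-in, reply-out function, and the character cap is a crude assumption standing in for a real token budget.

```python
from typing import Callable, Iterable


def rolling_summary(llm: Callable[[str], str], chunks: Iterable[str], max_chars: int = 2000) -> str:
    """Fold chunks into one summary, re-compressing after every chunk."""
    summary = ""
    for chunk in chunks:
        prompt = (
            f"Summary so far:\n{summary}\n\n"
            f"New material:\n{chunk}\n\n"
            "Rewrite the summary so it also covers the new material."
        )
        # Hard truncation is an assumption to keep the context bounded;
        # detail dropped here is lost for all later steps.
        summary = llm(prompt)[:max_chars]
    return summary
```

Every chunk passes through the same narrow summary, so precision can only go down as the codebase grows.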

My gut feeling is that what we call corporations are actually already a form of AI, but running on meat. I saw someone call Coca Cola a "paper clip maximizer", obviously for drinks instead of paper clips, but it actually - kind of - is. FWIW, I'm having a hard time thinking of it as anything else. Who controls it? What is it anyway?

CEOs have the same context window problem, which to my knowledge is mainly solved through delegation. The army might be another example: generals, officers, privates. How do you expect a general to make sensible statements about nitty-gritty operational details? It is not possible, but that does not mean the system as a whole cannot make progress towards a goal.

Maybe we need to treat LLMs like employees inside a company (which in its totality is the AI, not the individual agents). If we have unfettered access to low-cost LLMs this might be easier to experiment with.

I'm thinking like spinning up an LLM for every "class" or even every "method" in your codebase and letting it be a representative of that and only that piece of code. You can even call it George and let it join in on meetings to talk about it. George needs some "management" too, so there you go. Soon you'll have a veritable army of systems ready to talk about your code from their point-of-view. Black box the son of a gun and you're done. Clippy 2.0. My body is ready.

Yes, there are two key things here I think.

1 - we don't hold everything in working memory. We don't even hold everything in our heads, we store things elsewhere. We then learn/have ways of bringing relevant information to the fore.

2 - we have roles that we take on.

The hierarchy/collaboration of differently prompted roles gives rise to a lot more depth. I already had this with a two-LLM conversation about planning (one planner and one plan critic), which drove out much more detailed, actionable plans.
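
A rough sketch of that planner/critic loop, under assumptions: `planner` and `critic` are hypothetical prompt-in, reply-out callables (they could be the same model with different system prompts), and the `APPROVED` stop word is just a convention invented for this example.

```python
from typing import Callable

LLM = Callable[[str], str]


def plan_with_critic(planner: LLM, critic: LLM, goal: str, max_rounds: int = 3) -> str:
    """Alternate between a planner and a plan critic until the critic approves."""
    plan = planner(f"Goal: {goal}\nWrite a step-by-step plan.")
    for _ in range(max_rounds):
        feedback = critic(
            f"Goal: {goal}\nPlan:\n{plan}\n"
            "List concrete problems with this plan, or reply APPROVED."
        )
        if feedback.strip().upper().startswith("APPROVED"):
            break
        # Feed the critique back so the next draft addresses it.
        plan = planner(
            f"Goal: {goal}\nPrevious plan:\n{plan}\n"
            f"Critique:\n{feedback}\nWrite an improved plan."
        )
    return plan
```

`max_rounds` is there so two models that never agree cannot loop forever.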

With the information hierarchy, for code you'd probably want something like:

High-level goal summary/product description.

Lower-level summary about the area you're looking at.

API docs of linked components.

Full code of the class you're altering.

That's roughly what I have in mind I guess when working on a problem.
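
As a sketch, that information hierarchy could be assembled into a single prompt like this. The `CodeContext` fields and the `##` section titles are made up for illustration; how each summary gets produced is left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodeContext:
    product_summary: str   # high-level goal / product description
    area_summary: str      # lower-level summary of the area being worked on
    api_docs: List[str]    # API docs of linked components
    class_source: str      # full code of the class being altered


def build_prompt(ctx: CodeContext, task: str) -> str:
    """Order context from broadest to most specific, then state the task."""
    sections = [
        ("Product goal", ctx.product_summary),
        ("Area summary", ctx.area_summary),
        ("APIs of linked components", "\n".join(ctx.api_docs)),
        ("Full source of the class being changed", ctx.class_source),
        ("Task", task),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```

Only the innermost layer is full code; everything further out is compressed, which is what keeps the prompt inside the context window.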

I think what we call "role-play" might be more integral to intelligence than we tend to give it credit for. Now I think of it, a "job description" could be a good prompt.

If you start with a CEO-like agent that can work out what other jobs are necessary, then you can bootstrap from there. "I want to produce and sell red bread" => "We are going to need a bakery, accountant, marketeer, etc." and then those are "companies" of sorts with their own CEO that can think of how to solve their particular sub-problems.
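
A toy sketch of that bootstrap, assuming a hypothetical `llm` callable that answers with one role name per line. The `Role` type and the prompt wording are invented for the example, and `depth` caps the recursion so the "company" cannot grow without bound.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Role:
    title: str
    goal: str
    reports: List["Role"] = field(default_factory=list)


def bootstrap(llm: Callable[[str], str], title: str, goal: str, depth: int) -> Role:
    """Ask each role which sub-roles it needs, then recurse into those."""
    role = Role(title, goal)
    if depth == 0:
        return role
    reply = llm(
        f"You are the {title}. Goal: {goal}\n"
        "List the roles you need to delegate to, one per line. "
        "Reply with nothing if you need none."
    )
    for line in reply.splitlines():
        sub_title = line.strip()
        if sub_title:
            # Each sub-role acts as the "CEO" of its own sub-problem.
            sub_goal = f"Support '{goal}' as {sub_title}"
            role.reports.append(bootstrap(llm, sub_title, sub_goal, depth - 1))
    return role
```

"Hiring" here is just appending to `reports`; "firing" would be removing an entry from it.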

I think your comparison to a company is a really good mental model of a larger more capable collaborative structure.

You can even have "hiring" and "firing" where it's deciding to create or remove roles.

I think so too. I see room for different types of AI having a seat in this "collaborative structure", as you say. I think I'm going to call companies that from now on, by the way. Some AIs can specialize in "prompting" and pump out "workers" of varying effectiveness, which indeed can be "hired" and "fired" as performance metrics change.

I can see how more expensive and capable AIs get closer to the "executive seat" and lesser AIs - like what we now call GPTs - doing the grunt work. Interacting with humans and such, which is of course beneath the more powerful ones.

Using text - and thus providing a vehicle for the concepts it encodes - is brilliant. It enables cross-cutting communication between systems that otherwise have very little to do with each other (GPT<->Wolfram). As programmers we have a front-row seat on the code=data front. We are trained to see how text can be converted into action, something I find most regular people have trouble even visualizing ("It's just text").

I guess we were on to something when we as humans started to talk to each other..

I am surprised by how many, even among the tech community, dismiss GPT wholesale as a glorified auto-complete, or "a statistical model on human information".

What, then, is the human brain if not a trained statistical model? Granted it is considerably more sophisticated in some ways, but in many other ways it is less sophisticated and less capable.

I wonder if the same reaction would have happened if ChatGPT had waited and released with GPT-4. It's very different.