by SanderNL 1190 days ago
I'm taking the liberty to spread my most recent words of visionary wisdom here. (/s)

One of my main issues with these guys is their context window. Their memory. It's hard to see an LLM working on a codebase a few thousand tokens at a time and still being precise about it. To do that you need summarization techniques: feeding the prompt incrementally compressed summaries and hoping it maintains cohesion.

That sounds a lot like trying to let the CEO of a company do all the grunt work by feeding him summaries. "Mr Gates, here's a 2 paragraph summary of our codebase. Should we name the class AnalogyWidgetProducer or FactoryWidgetAnalogyReporter?"

I don't think that's going to work.
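For concreteness, here is a minimal sketch of the incremental-summary approach described above. Everything in it is an assumption for illustration: `rolling_summary` and `fake_summarize` are hypothetical names, the summarizer is a stub rather than a real model call, and characters stand in for tokens.

```python
from typing import Callable, Iterable


def rolling_summary(chunks: Iterable[str],
                    summarize: Callable[[str], str],
                    max_chars: int = 200) -> str:
    """Fold chunks into one running summary capped at max_chars."""
    summary = ""
    for chunk in chunks:
        # Each step sees only the previous summary plus one new chunk,
        # never the full history.
        summary = summarize(summary + "\n" + chunk)[:max_chars]
    return summary


def fake_summarize(text: str) -> str:
    # Stand-in for an LLM call (assumption): keep only the last 8 words.
    return " ".join(text.split()[-8:])


chunks = ["alpha beta gamma delta epsilon", "zeta eta theta iota kappa"]
result = rolling_summary(chunks, fake_summarize)
```

With this crude stub the earliest words have already fallen out of the summary after two chunks, which is the cohesion worry in miniature. A real summarizer would lose detail less bluntly, but it still has to drop something at every step.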

My gut feeling is that what we call corporations are actually already a form of AI, but running on meat. I saw someone call Coca-Cola a "paper clip maximizer" (for drinks instead of paper clips, obviously), and it actually, kind of, is. FWIW, I'm having a hard time thinking of it as anything else. Who controls it? What is it anyway?

CEOs have the same context window problem, which to my knowledge is mainly solved through delegation. The army might be another example: generals, officers, privates. How do you expect a general to make sensible statements about nitty-gritty operational details? It is not possible, but that does not mean the system as a whole cannot make progress towards a goal.

Maybe we need to treat LLMs like employees inside a company (which in its totality is the AI, not the individual agents). If we have unfettered access to low-cost LLMs this might be easier to experiment with.

I'm thinking of spinning up an LLM for every "class" or even every "method" in your codebase and letting it be a representative of that and only that piece of code. You can even call it George and let it join in on meetings to talk about it. George needs some "management" too, so there you go. Soon you'll have a veritable army of systems ready to talk about your code from their point-of-view. Black box the son of a gun and you're done. Clippy 2.0. My body is ready.
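A rough sketch of what "one representative per class" could look like, assuming Python source and the standard `ast` module. `CodeAgent` and `agents_for` are hypothetical names, and the actual model call is left out; each agent only builds the prompt it would send.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class CodeAgent:
    name: str    # the class this agent represents
    source: str  # the only code this agent ever sees

    def prompt(self, question: str) -> str:
        # The prompt pins the agent to its one piece of code.
        return (f"You represent only the class `{self.name}`.\n"
                f"Its source:\n{self.source}\n\n"
                f"Question: {question}")


def agents_for(source: str) -> List[CodeAgent]:
    """Create one agent per top-level class in the given module source."""
    tree = ast.parse(source)
    return [
        CodeAgent(node.name, ast.get_source_segment(source, node) or "")
        for node in tree.body
        if isinstance(node, ast.ClassDef)
    ]


module = (
    "class Widget:\n"
    "    size = 1\n"
    "\n"
    "class Factory:\n"
    "    def make(self):\n"
    "        return Widget()\n"
)
agents = agents_for(module)
```

Going down to one agent per method would be the same walk over `ast.FunctionDef` nodes. The "management" layer is not shown here.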

1 comments

Yes, there are two key things here I think.

1 - we don't hold everything in working memory. We don't even hold everything in our heads, we store things elsewhere. We then learn/have ways of bringing relevant information to the fore.

2 - we have roles that we take on.

The hierarchy/collaboration of differently prompted roles gives rise to a lot more depth. I already saw this with a two-LLM conversation about planning (one planner and one plan critic): it drove out much more detailed, actionable plans.
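The planner/critic setup can be sketched as a simple loop. This is an illustration under assumptions, not the setup actually used: `refine_plan` is a hypothetical name, the prompt wording is made up, and `planner` and `critic` are any callables that take a prompt and return text.

```python
from typing import Callable


def refine_plan(goal: str,
                planner: Callable[[str], str],
                critic: Callable[[str], str],
                rounds: int = 3) -> str:
    """Alternate between a planner and a critic for up to `rounds` rounds."""
    plan = planner(f"Goal: {goal}\nWrite a plan.")
    for _ in range(rounds):
        feedback = critic(
            f"Goal: {goal}\nPlan:\n{plan}\n"
            "List concrete problems, or reply OK."
        )
        if feedback.strip() == "OK":
            break  # the critic has nothing left to object to
        plan = planner(
            f"Goal: {goal}\nPlan:\n{plan}\n"
            f"Critique:\n{feedback}\nRevise the plan."
        )
    return plan
```

The two roles can be the same underlying model with different system prompts. The `rounds` cap is there because nothing guarantees the critic ever says OK.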

With the information hierarchy, for code you'd probably want something like:

- High-level goal summary/product description.
- Lower-level summary about the area you're looking at.
- API docs of linked components.
- Full code of the class you're altering.

That's roughly what I have in mind, I guess, when working on a problem.
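As a sketch, that information hierarchy could be assembled into a single prompt like this. The names `Layer` and `build_context` are hypothetical, characters stand in for tokens, and the policy of keeping the code whole while truncating the summaries is my assumption, not something stated above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    title: str
    text: str


def build_context(layers: List[Layer], budget: int) -> str:
    """Join layers from general to specific.

    The last layer (the full code) is kept whole. Earlier layers share
    whatever budget remains and are truncated to fit. Headings are not
    counted against the budget.
    """
    *summaries, code = layers
    remaining = max(budget - len(code.text), 0)
    share = remaining // max(len(summaries), 1)
    parts = [f"## {layer.title}\n{layer.text[:share]}" for layer in summaries]
    parts.append(f"## {code.title}\n{code.text}")
    return "\n\n".join(parts)


layers = [
    Layer("Goal", "g" * 100),
    Layer("Area", "a" * 100),
    Layer("API docs", "d" * 100),
    Layer("Code", "c" * 50),
]
context = build_context(layers, budget=110)
```

In this toy run the code takes 50 of the 110 characters and each of the three summaries gets 20. Cutting summaries mid-text is crude; picking them by relevance would be the obvious next step.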

I think what we call "role-play" might be more integral to intelligence than we tend to give it credit for. Now that I think of it, a "job description" could be a good prompt.

If you start with a CEO-like agent that can think of what other jobs are necessary, then you can bootstrap from there. "I want to produce and sell red bread" => "We are going to need a bakery, accountant, marketeer, etc." Those are then "companies" of sorts, each with its own CEO that can think of how to solve its particular sub-problems.
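A minimal sketch of that bootstrap, with the "CEO" reduced to a function that proposes sub-roles for a job. `Role`, `bootstrap`, and the lookup table are all hypothetical; in practice `propose_roles` would be an LLM call prompted with the job description.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Role:
    job: str
    reports: List["Role"] = field(default_factory=list)


def bootstrap(job: str,
              propose_roles: Callable[[str], List[str]],
              depth: int = 2) -> Role:
    """Recursively let each role decide which sub-roles it needs."""
    role = Role(job)
    if depth > 0:  # the depth cap stops the org chart growing forever
        role.reports = [
            bootstrap(sub, propose_roles, depth - 1)
            for sub in propose_roles(job)
        ]
    return role


# Stub "CEO" (assumption): a fixed table instead of a model call.
table: Dict[str, List[str]] = {
    "produce and sell red bread": ["bakery", "accountant", "marketeer"],
    "bakery": ["baker"],
}
org = bootstrap("produce and sell red bread", lambda job: table.get(job, []))
```

Each sub-role is itself a small "company" with its own `propose_roles` step, which is the recursive part of the idea.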

I think your comparison to a company is a really good mental model of a larger, more capable collaborative structure.

You can even have "hiring" and "firing" where it's deciding to create or remove roles.

I think so too. I see room for different types of AI having a seat in this "collaborative structure", as you say. I think I'm going to call companies that from now on, by the way. Some AIs can specialize in "prompting" and pump out "workers" of varying effectiveness, which indeed can be "hired" and "fired" as their performance metrics change.

I can see how more expensive and capable AIs get closer to the "executive seat" while lesser AIs, like what we now call GPTs, do the grunt work: interacting with humans and such, which is of course beneath the more powerful ones.

Using text - and thus providing a vehicle for the concepts it encodes - is brilliant. It enables cross-cutting communication between systems that otherwise have very little to do with each other (GPT<->Wolfram). As programmers we have a front-row seat on the code=data front. We are trained to see how text can be converted into action, something I find most regular people have trouble even visualizing. ("It's just text")

I guess we were on to something when we as humans started to talk to each other..