I give myself 6-18 months before I think top-performing LLMs can do 80% of the day-to-day issues I'm assigned.
> Why doesn’t anyone acknowledge loops like this?
This is something you run into early on when using LLMs and learn to sidestep. This looping is a sort of "context-rot" -- the agent has the problem statement as part of its input, and then a series of incorrect solutions.
Now what you've got is a junk-soup where the original problem is buried somewhere in the pile.
Best approach I've found is to start a fresh conversation with the original problem statement and any improvements/negative reinforcements you've gotten out of the LLM tacked on.
I typically have ChatGPT 5 Thinking, Claude 4.1 Opus, Grok 4, and Gemini 2.5 Pro all churning on the same question at once, and then copy-paste relevant improvements across them.
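For what it's worth, here is a minimal sketch of the "fresh conversation" step in Python. The function name `build_fresh_prompt` and the "Notes from earlier attempts" heading are my own placeholders, not anything a particular tool provides; it only assembles the text you would paste into a new chat.

```python
def build_fresh_prompt(problem: str, lessons: list[str]) -> str:
    """Assemble a new-conversation prompt: the original problem statement
    first, then whatever improvements or dead ends were learned so far."""
    lines = [problem.strip()]
    if lessons:
        lines.append("")
        # Hypothetical heading; use whatever wording suits your workflow.
        lines.append("Notes from earlier attempts:")
        lines.extend(f"- {lesson.strip()}" for lesson in lessons)
    return "\n".join(lines)


prompt = build_fresh_prompt(
    "Fix the parser.",
    ["Approach A failed because of X."],
)
```

The point is that the failed attempts themselves never make it into the new context, only the short notes distilled from them.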
I concur. Something to keep in mind is that it is often more robust to pull an LLM towards the right place than to push it away from the wrong place (or more specifically, the active parts of its latent space). Sidenote: also kind of true for humans.
That means that positively worded instructions ("do x") work better than negative ones ("don't do y"). The more concepts that you don't want it to use / consider show up in the context, the more they do still tend to pull the response towards them even with explicit negation/'avoid' instructions.
I think this is why clearing all the crap from the context save for perhaps a summarizing negative instruction does help a lot.
> positively worded instructions ("do x") work better than negative ones ("don't do y")
I've noticed this.
I saw someone on Twitter put it eloquently: something about how, just like little kids, the moment you say "DON'T DO XYZ" all they can think about is "XYZ..."
> That means that positively worded instructions ("do x") work better than negative ones ("don't do y").
In teacher school, we're told to always give kids affirmative instructions, i.e. "walk" instead of "don't run". The idea is that with a negative instruction, it takes more energy for a child to figure out what to do instead.
> This looping is a sort of "context-rot" -- the agent has the problem statement as part of its input, and then a series of incorrect solutions.
While I agree, and also use your workaround, I think it stands to reason this shouldn't be a problem. The context had the original problem statement along with several examples of what not to do, and yet it keeps repeating those very things instead of coming up with a different solution. No human would keep trying one of the solutions included in the context that are marked as not valid.
I'm sure somewhere in the current labs there are teams that are trying to figure out context pruning and compression.
In theory you should be able to get a multiplicative effect on context window size by consolidating context into its most distilled form.
30,000 tokens of wheel spinning to get the model back on track, consolidated to 500 tokens of "We tried A, and it didn't work because XYZ, so avoid A" and kept in recent context.
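A rough sketch of that consolidation in Python, assuming you already know which approaches failed and why (producing those summaries is the hard part and is not shown). `FailedAttempt`, `compact_context`, and `rough_size` are hypothetical names, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class FailedAttempt:
    approach: str  # what was tried
    reason: str    # why it did not work


def compact_context(problem: str, failures: list[FailedAttempt],
                    recent: list[str]) -> list[str]:
    """Replace the full transcript of failed attempts with one short note
    per failure, keeping the problem statement and the recent messages."""
    notes = [
        f"We tried {f.approach}, and it didn't work because {f.reason}, "
        f"so avoid {f.approach}."
        for f in failures
    ]
    return [problem, *notes, *recent]


def rough_size(messages: list[str]) -> int:
    """Crude size estimate: whitespace-separated words, not real tokens."""
    return sum(len(m.split()) for m in messages)


compacted = compact_context(
    "P", [FailedAttempt("A", "XYZ")], ["latest"]
)
```

Comparing `rough_size` of the original transcript against the compacted list gives a ballpark for how much window you win back.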
I agree it shouldn't be a problem, but if you don't regularly run into humans who insist on trying solutions clearly signposted as wrong or not valid, you're far luckier than I am.
> I give myself 6-18 months before I think top-performing LLMs can do 80% of the day-to-day issues I'm assigned.
This is going to age like "full self driving cars in 5 years". Yeah it'll gain capabilities, maybe it does do 80% of the work, but it still can't really drive itself, so it ultimately won't replace you like people are predicting. The money train assures that AGI/FSD will always be 6-18 months away, despite no clear path to solving glaring, perennial problems like the article points out.
> The money train assures that AGI/FSD will always be 6-18 months away
I vividly remember when some folks from Microsoft came to my school to give a talk at some Computer Science event and proclaimed that yep, we have working AGI, the only limiting factor is hardware, but that should be resolved in about ten years.
But still under pressure in the short-term, no? As companies lean into AI as a means of efficiency / competitive advantage / cost savings, jobs will be eliminated / reduced while companies find their direction. The potential gains are said to be too big to sit on the sidelines and wait to be a late-adopter.
Yes, hold on to your job like your life depends on it, because after this bubble pops the job market will get even worse. Then you need to hold on through the trough until experienced engineers are valued again, once all of the AI waste flushes out of the system.
Honestly when I speak about these sorts of issues I get the feeling that other people view me as some kind of luddite, especially people above me who presumably want to replace as many people with AI as possible. I suppose me pointing out the flaws breaks the illusion of magic that people want AI to have.
> I suppose me pointing out the flaws breaks the illusion of magic that people want AI to have.
My impression is rather: there exist two kinds of people who are "very invested in this illusion":
1. People who want to get rich by either investing in or working on AI-adjacent topics. They of course have an interest to uphold this illusion of magic.
2. People who have a leftist agenda ("we will soon all be replaced by AI, so politics has to implement [leftist policy measures like UBI]"). If people realize that AI is not so powerful after all, such leftist political measures, whose urgency was argued from the (hypothetical) huge societal changes that AI will cause, will not have a lot of backing in society, or at least will not be considered urgent.
The left is generally extremely sceptical of UBI, as its main proponents tend to be classically liberal groups (so not "US liberal") pushing it as a means to contain and limit welfare systems by dropping welfare programs in favour of a general, low UBI.
The more leftist position ever since the days of Marx has been that "right rather than being equal would have to be unequal" to be equitable given that people have different needs, to paraphrase from Critique of the Gotha Program - UBI is in direct contradiction to socialist ideals of fairness.
The people I see pushing UBI, on the contrary, usually seem motivated either by the classically liberal position of using it to minimise the state, or by a fear of threats to the stability of capitalism. Saving capitalism from perceived threats to itself isn't a particularly leftist position.
I agree with your first point but regarding your second: I’m as far left as it gets and I don’t think that’s true at all. Most of the influencers I follow despise AI and also are highly skeptical of the outrageous claims made by Sam Altman etc. The reality is that the need for things like universal health care exists today. Tens of millions of people cannot get medical care in the US. Insurance companies are allowed to deny claims with no justification. That has nothing to do with AI taking jobs BUT it does involve AI because United Health’s denial rate went through the roof right after they started letting AI determine which claims were covered by policy with no human review. So people on the left are talking about AI in contexts that it doesn’t seem you’re aware of.
Because it's easy to learn to stop engaging with those loops, treating them as a sign you provided too little context, and instead start a new conversation with an expanded prompt.
It doesn't mean these loops aren't an issue, because they are, but once you stop engaging with them and cut them off, they're a nuisance rather than a showstopper.
They happen in subtle ways that aren't always easy to spot, and rarely early enough in a project that I'm willing to just throw it away.
"So what if you have to throw out a week's worth of work. That's how these things work. Accept it and you'll be happier. I have and I'm happy. Don't you see that it's OK to have your tool corrupt your work half way through. It's the future of work and you're being left behind by not letting your tools corrupt your work arbitrarily. Just start over like a real man."
The AI-fanbois will quickly tell you that you are misusing the context or your prompt is "wrong".
But I've had it consistently happen to me on tiny contexts (e.g. I've had to spend time trying - and failing - to get it to fix a mess it was making with a straightforward 200-ish line bash script).
And it's also very frequently happened to me when I've been very careful with my prompts (e.g. explicitly telling it to use a specific version of a specific library ... and it goes and ignores me completely and picks some random library).
This is just the new version of "works on my machine". Oh, I was able to contrive a correct answer from my prompt because the random number generator smiled upon me today.