Hacker News
by wmal 606 days ago
I wanted to find the actual change performed by these agents, so I watched the embedded video. I cannot believe what I saw.

The video shows a private fork of a public repository. The bug is real, but it was resolved in February 2023, and it doesn’t seem like the solution was automated [1].

The bug has a stack trace attached with a big arrow pointing to line 223 of a backend_compat.py file. A quick glance at this stack trace and you already know what happened, why, and how to fix it, but…

not for the agent. It seems to analyze the repository in multiple steps and tries to locate the class. Why did they even release this video?

[1] https://github.com/Qiskit/qiskit/issues/9562

6 comments

Mgmt at every company is asked: what are you doing to be agentic?

So they organize hackathons where devs build a hypothetical agentic framework nobody will dare use, so that mgmt can claim: look here, this is what I have done to be agentic.

You should ask: would you dogfood your agent? The answer is no way. These are meant purely for marketing purposes, as they don't meet an end-user need.

What's hilarious in this farce is how these are being rebranded from "co-pilots" to "agents".

It just goes to show that it is all a big song-and-dance. Much ado about nothing.

The term "co-pilot" implies a company has to hire a software engineer to guide the AI.

The term "agent" implies you can give the AI full access to your repos and fire the software engineers you're grudgingly paying six figures to.

The second is much more valuable to executives not wanting to pay the software people that demand higher salaries than virtually everyone else in the organization.

There was no rebrand. They're different concepts. Copilot and similar solutions give hints as you do the development. Agents are systems that receive a goal and will iterate actions and queries for more information until they achieve the goal.
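To make the "iterate until the goal is achieved" part concrete, here is a minimal sketch of such a loop. Everything in it is an assumption made up for illustration: the toy `REPO` dictionary, the `run_agent` function, and the two actions (list the files, read one file). It is not how any real agent product works, just the shape of the idea.

```python
from typing import Dict, List, Optional

# Toy "repository": file name -> contents. Entirely hypothetical data.
REPO: Dict[str, str] = {
    "a.py": "def helper(): pass",
    "b.py": "class Target: pass",
}


def run_agent(goal: str, max_steps: int = 10) -> Optional[str]:
    """Take one action per step until the file defining class `goal` is found."""
    unread: List[str] = []  # files the agent still plans to look at
    listed = False
    for _ in range(max_steps):
        if not listed:
            # Action 1: query for more information (list the files).
            unread = sorted(REPO)
            listed = True
            continue
        if not unread:
            return None  # nothing left to try, goal not achievable
        # Action 2: read one file and check whether the goal is achieved.
        name = unread.pop(0)
        if f"class {goal}" in REPO[name]:
            return name
    return None  # step budget exhausted before reaching the goal


print(run_agent("Target"))
```

A copilot-style tool would correspond to a single suggestion with the human deciding the next step; the loop with its own step budget is what makes this an agent in the sense described above.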
you are quoting the party line.

i am saying, the thing is snake-oil - a solution looking for a problem.

I'm explaining what words mean. The agentic approach has been a thing for years (https://en.wikipedia.org/wiki/Intelligent_agent). You can just say you don't like AI in programming, without saying incorrect things on top of that.
Right. Woe is the startup that doesn't have an AI story right now.
The companies that have a data moat and no AI are in a much better position than those who’ve got it the other way around.
Depends on what you are optimizing for.

Long term value, I agree.

Fundraising, hard disagree.

Classic machine learning researcher trick: just select your test example from the training set! It certainly saves a lot of effort.
That’s true, but this repo has thousands of bugs. They could at least have found one that was in the training set but did not contain the location in the bug description.

That way it would at least look like it might work.

Decision makers and those writing the check aren’t sophisticated enough to know the difference, in my experience with orgs that buy from IBM.
every hype cycle runs through a predictable course.

we are at a phase where the early adopters have seen the writing on the wall, i.e. that LLMs are useful for a limited set of use cases. but there are lots of late adopters who are still awestruck and not disillusioned yet.

Indeed. It's also amusing how it produces a multi-page essay on the bug instead of submitting a pull request with an actionable fix.
The demo is not supposed to wow the technical people. The business people whose budgets will pay for this are less likely to notice.
I think the process could be better, but if you want good quality you really shouldn't expect it to just jump at the "obvious" thing. Just like you wouldn't want a developer to just make the error go away in the quickest way. Getting more context is always going to be a good idea, even if it wastes some time in the "trivial" cases.
it takes more time to watch the video than to fix the bug
you can't expect everything at once. just one step forward. note how fast everything has moved since 2020, and it's accelerating. finally 'it's' coming...