by Kiro 572 days ago
Devin is real. What do you mean?

Anyway, this is pretty standard stuff already. In all my agent workflows the agents are able to write their own code and execute it before passing the result to the next agent. It doesn't need to be perfect since you always have an agent validating the results, sending the task back if necessary.
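To make that loop concrete, here is a rough sketch of the write / execute / validate cycle. It is not taken from the paper or from any particular framework. `run_snippet`, `solve`, `stub_writer` and `stub_validator` are all made-up names, and the two stubs stand in for what would be model calls in a real setup.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_snippet(code: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run code in a fresh Python process and return (succeeded, output)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()


def solve(
    task: str,
    writer: Callable[[str, str], str],
    validator: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask the writer for code, run it, and retry until the validator accepts."""
    feedback = ""
    for _ in range(max_rounds):
        code = writer(task, feedback)
        ok, output = run_snippet(code)
        if ok and validator(task, output):
            return output  # accepted: hand the result to the next agent
        # Send the task back with whatever went wrong.
        feedback = output if not ok else f"rejected output: {output!r}"
    return None  # gave up after max_rounds attempts


# Hypothetical stand-ins for the two agents (assumptions, not real model calls).
def stub_writer(task: str, feedback: str) -> str:
    # The first attempt is deliberately wrong; the retry is correct.
    return "print(2 + 2)" if feedback else "print(2 + 3)"


def stub_validator(task: str, output: str) -> bool:
    return output == "4"


if __name__ == "__main__":
    print(solve("add 2 and 2", stub_writer, stub_validator))
```

With the stubs above, the first attempt gets rejected and the second one is accepted. A real version would also want to sandbox the subprocess, since it is running code nobody has reviewed.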

I haven't read the paper beyond the synopsis, so I might be missing a key takeaway, and I presume it has a lot of additional layers.

1 comment

As evidenced by the reaction to Devin, no, it’s not real.

There’s a limit, beyond which agent-generated code is, in general, not reliable.

The demos from everyone who claims otherwise (like the Devin videos) have been shown to be fake [1] or cherry-picked.

Having agents generate arbitrary code to solve arbitrary problems is. Not. A. Solved. Problem.

Yet.

…no matter how many AI bros currently claim otherwise.

Being able to decompose complex problems into parts small enough to be solved by current models would be a big deal if it were real.

(Because the current SoTA can’t reliably do this; that should not be a remotely controversial claim to people familiar with this space.)

So, tl;dr: extraordinary claims require extraordinary evidence, which is absent here, as far as I can tell. They specifically call out in the paper that generated actions are overly specific and don’t always work. But, as I said, it’s doing well on the leaderboard, so it’s clearly doing something that works; there’s just noooooo way of seeing what.

[1] - https://www.zeniteq.com/blog/devins-demo-as-the-first-ai-sof...