Hacker News new | ask | show | jobs
by bitexploder 21 days ago
There is a great thing. Because the agents can do so much toil you can add things like formal verification, fuzzing, and other feedback mechanisms and quality gates to your projects cheaply. In a human written project you still needed those things, but it cost a lot. Agents require these quality gates and they can implement them for you. The problem with AI documentation is it will just write a lot of useless bullshit unless you guide it on what is important. You can also get agents to identify transitive dependencies via testing and other things.

I adopt the mindset of docs are for humans, tests are for agents. They document formal dependencies and leave a measurable artifact behind. If you identify some behavior or transitive dep in your system, agents document it first with a test codifying the expected behavior. Tests are the source of truth about expected system behavior and you can convince agents to write decent behavioral tests if you ask them to with the right structure. Docs are now cheap and a render, not a long term thing. There is some token efficiency to consider, but still, they are quick and cheap if you don't understand some module or its purpose.

2 comments

Yeah "plus one" to this. Static analysis, fuzzing, linting, integration tests -- there are all sorts of very useful artifacts which have been around for a long time, but which are very time consuming to implement and then maintain. LLMs shift the economics around producing and maintaining these tremendously, so we can now afford these robust validation mechanisms.

These serve as living documentation which cries out in pain when they get out of sync with the system in question, generating specific error messages -- as opposed to natural language docs which rapidly drift into an ambiguous "kinda useful" state. And the validation is performed mechanically (as opposed to neurally) so no hallucinations are possible.

The one thing I would add is that you do want these artifacts to be human-friendly from a reading perspective -- you want engineers to be able to scan over these and check that they are validating the right things.

> Because the agents can do so much toil you can add things like formal verification, fuzzing, and other feedback mechanisms and quality gates to your projects cheaply

Works great until they sweep you a test under the rug which always passes because the condition is something like if(true) .

That was my point. Validating actual behavioral tests. Not letting them cheat. They still will at times, but like, resd their code, fix it or send a reviewer agent to find and make todo list. If you give them a behavioral test skill it will do a much better job. Sometimes I have to hint to them. I rarely ship anything I have not reviewed at least once.
> Not letting them cheat. They still will at times, but like, resd their code,

Well then, if they "still will", your effort kind of misses the point. Sure maybe, you'll catch it every time and maybe that one time you did not catch it, it was no critical mistake...But it only needs to make that critical mistake once, and all of this effort was in vain.

(as an outsider) what this sounds a lot like to me is trying to manage a very large team of human personnel that have a high turnover rate which is not directly in your control.

Some of them will make mistakes, some of them will cheat, some of them will do things you don't like, and "punishing" them will be less helpful to you due to the high turnover than building a system which instead disincentivizes things from a high level. Which catches bad actions and starts them over.

Classically I think we are more accustomed to "building a team of humans, and being able to chastize or fire a bad employee helps the team grow more cohesive and build accountability".

But it is possible to get the same (less than ideal) situation with teams of humans where accountability cannot be easily instilled into the team as we have with teams of agents.

And then obviously the reason one might consider using such an unusual and difficult to manage team as a tool is when the cost is low and the supply is high, which is purportedly the case with AI at least for the moment.

Right, you design systems resilient to this in traditional software engineering as well. Agents are just... a little more chaotic at times :-D