Hacker News
by extr 775 days ago
I've noticed the same issue with AI coding, where you start to write requirements and then realize that you yourself don't have a perfect idea of what exactly this feature should be, or how it should work. It's easy to say the answer should be to simply think harder, or enter a dialogue with the AI about missing details, but if you try that you'll find yourself supplying an enormous amount of context you didn't expect to have to communicate. Context not even directly related to the code at hand, but about the broader business or industry, past lessons learned, something the CEO said to you last week about the feature, etc.

It's this kind of thing that makes me think tackling big feature requests is still an AGI-complete problem. Perhaps if it gets good enough at pure coding you can iterate your way to success.

9 comments

> but if you try that you'll find yourself supplying an enormous amount of context you didn't expect to have to communicate. Context not even directly related to the code at hand, but about the broader business or industry, past lessons learned, something the CEO said to you last week about the feature, etc.

Basically you go from programmer to product manager, except you also get to micromanage a non-sentient programmer

What prevents an AI agent from becoming the product manager as well, and communicating with you (the customer) to clarify requirements?

A failure mode of product managers is to just pass customer requests to the developers.

I don't see an AI agent doing a good job of avoiding that.

The tenor of the conversation, I imagine, since it’s a chatbot.

Slavery laws, presumably.

I don't know if we're talking about exactly the same thing, but this is my side story:

Even in small requests to AI, I find myself accidentally including words or phrases that seem to tell the AI, "Oh, he wants this as a function that does all the things very manually".

So I get some fairly capable, but very verbose and often inflexible code.

Yet, that's not what I was asking for, but something in the context set the AI off in that direction. In reality I'm not sure what I want and I'm open to anything.

Often I suddenly realize, "Wait, there's gotta be something built into this language that does this, or part of this..." and often there is, and it's far more reliable and a better way to do it. Somehow the AI skipped that and gave me a different answer.
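To make that concrete, here's a made-up illustration (not from an actual AI session; the function names are hypothetical). Both functions return the n most frequent words in a string. The first is the "very manual" style, and the second leans on `collections.Counter` from the standard library:

```python
from collections import Counter


def most_common_words_manual(text, n):
    """Hypothetical 'does all the things very manually' version."""
    counts = {}
    for word in text.lower().split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    # Hand-rolled ranking: sort (word, count) pairs by count, highest first.
    pairs = list(counts.items())
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:n]


def most_common_words_builtin(text, n):
    """Same result using the standard library's Counter."""
    return Counter(text.lower().split()).most_common(n)


if __name__ == "__main__":
    sample = "the cat and the hat and the bat"
    print(most_common_words_manual(sample, 2))
    print(most_common_words_builtin(sample, 2))
```

The manual version isn't wrong, it's just more code to read and maintain for something the language already ships.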

It strikes me as similar to customers who come to me with "I want an email that's sent on Tuesdays that are single-digit calendar dates when this field contains the letter Q and ..." But when I ask them what they're trying to accomplish, I find all that specificity isn't needed: they really mean they order all their grapes on Tuesdays at the beginning of the month, and they just want a list of their grape orders every few weeks.

Yeah this is a similar phenomenon. AI is not so good at recognizing that you're looking for the "general" solution to the problem, one that will holistically fit in with the rest of the codebase/objective, and what has been provided as an example is really just a special case.

I think part of the problem is that instruction fine tuning is not done on full codebases, just shorter problems that fit into reasonable (8K, 32K) context windows. By nature these problems are more specific, so they are biased in that direction from the start.

And that is why the I in AI is still misleading.

> then realize that you yourself don't have a perfect idea of what exactly this feature should be

I talked about this when Copilot Workspaces last reached the front page two days ago: I don't think the value is in the code generation, but rather in the ability to capture our thought process. CW is currently a bottleneck, in my opinion, and I think the code generation will have to get pretty good before we can see the value in writing everything down vs. just coding as we have always done.

Agreed.

The most compelling part of the demo showcased in this post is the way that the tool built the bulleted list of success criteria -- that's so often a tedious and overlooked part of writing user stories, but its importance shouldn't be understated -- the fact that it bakes that step into the workflow feels like the most valuable piece of the puzzle here.

I only got to use the Workspaces feature for an hour before they fixed the waitlist access check but IMO the real value is that it provides a familiar PR-style interface for the whole process that enables fast iterations.

TFA didn't show a screenshot of it but the per file plans and the diffs are side by side on a single screen so you can update the per file plan (adding and removing files as needed) and then "re-roll" the code changes as you go. With the Codespaces feature you can even launch the project and get access to a terminal to run stuff and presumably feed the output back into the plan.

It makes it really easy to spot deficiencies in the code, add comments in the plan, and instantly regenerate the code (well, not instantly, there was a queue when I used it). It was a lot smoother than my experience with Copilot Chat, Aider, and Plandex.

The same is true on the code level, which can be viewed as a more detailed specification.

Part of the fun of software development is exploring the solution space by implementing, and gaining a deeper understanding in the process, as well as coming up with the corresponding design decisions.

It seems that with current AI, in order to steer it and evaluate its output, you would have to build that deeper understanding up front without doing the work, which seems difficult.

By the moment you are clear on what the requirements are, you have already solved the problem.

Programming is the task of finding the real requirements!

To me this looks similar to rubber-ducking or technical writing. All three involve mentally modeling the perspective of someone who may not share your knowledge or assumptions.

> It's this kind of thing that makes me think tackling big feature requests is still an AGI-complete problem. Perhaps if it gets good enough at pure coding you can iterate your way to success.

I think you’ve just invented product managers. This used to be part of a software engineer’s job, back when inputting code into a computer was so labor intensive that you’d write your program and then hand it off to another human to translate into machine code.

Then we invented compilers and now programming can take up a whole person’s day so programmers stopped having time to do product management. That became a full-time job supplying 4+ programmers with enough work to stay busy.

If we can replace those 4 programmers with AI, software engineers will once more turn back into product managers.

The best product managers I’ve worked with have some combination of a comp sci and business background. The CS background helps a lot.

And some of the best software engineers I’ve worked with are basically their product manager’s right hand: partnering smoothly in developing requirements, communicating technical feasibility, and deeply understanding their customers. They could be product managers but choose not to.

and that's because software engineering is <50% "writing code"

TDD is a great way to show exactly how much you understand what you're about to build. You make all the decisions about edge cases and various conditions ahead of time, before even getting to the code.
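A minimal sketch of what that looks like, using an invented example (the `split_evenly` function and its rules are assumptions for illustration, not anything from the article). The tests come first and each one records a decision: what happens with leftover cents, a zero total, or zero people. The implementation underneath is then mostly typing:

```python
from typing import List


# Tests written first: each one pins down a decision about an edge case.
def test_divides_cleanly():
    assert split_evenly(900, 3) == [300, 300, 300]


def test_leftover_cents_go_to_the_first_people():
    assert split_evenly(1000, 3) == [334, 333, 333]


def test_zero_total_is_allowed():
    assert split_evenly(0, 2) == [0, 0]


def test_zero_people_is_an_error():
    try:
        split_evenly(100, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for zero people")


# Hypothetical implementation, written only after the tests above exist.
def split_evenly(total_cents: int, people: int) -> List[int]:
    """Split a bill in cents so that shares differ by at most one cent."""
    if people <= 0:
        raise ValueError("people must be positive")
    if total_cents < 0:
        raise ValueError("total_cents must be non-negative")
    base, remainder = divmod(total_cents, people)
    # The first `remainder` people each pay one extra cent.
    return [base + 1 if i < remainder else base for i in range(people)]


if __name__ == "__main__":
    test_divides_cleanly()
    test_leftover_cents_go_to_the_first_people()
    test_zero_total_is_allowed()
    test_zero_people_is_an_error()
    print("all tests passed")
```

Notice that the hard part was deciding who pays the extra cent and whether zero people is an error, not writing the `divmod` line.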

It sounds sort of like pseudocode: analyzing the requirements and needs to the point that all that's left is typing it out in whatever programming language you're using. I can see an LLM being pretty good at that, but then that's just a higher-level version of a compiler going from a programming language to what the machine understands. You start with very well structured human language, the LLM turns that into something the compiler understands, and then that is turned into something the machine understands.

It sounds like using an LLM to write code requires such careful preparation and wording ahead of time that it's basically like writing in a very high-level programming language itself.

Yeah, this is my experience as well. Once I've fully fleshed out the requirements to the point that there is zero ambiguity in what I want, I've basically written a pseudocode implementation already and the AI is just saving me some typing.