| I seriously don't get all this big hullabaloo about one-shot prompting. By definition, a single prompt won't capture the complexity of a software project. Ergo, what you'll get is a series of assumptions made by the model based on preexisting code in its training corpus. I'd rather see a coding agent that can follow steps in a plan file to a T while following guardrails and adhering to the proper coding conventions in the human-reviewed spec. I'd rather see performance in agent loops against human-defined objectives, where it can be verified to stick to defined guardrails and continue without drift until its objectives are complete. I'd also like to see it identify bugs and potential performance improvements by examining existing code and suggesting refactors based on context it can pick up about the particular use case you are trying to build. These are way more valuable metrics than "hey build X". |
But on a more practical note, the ultimate goal of LLMs is to replace software engineers, or, to use a more upbeat phrasing that's no less accurate, to enable everybody to become a software engineer. And so an LLM's ability to reliably construct something from a poorly defined, contradictory, or otherwise flawed prompt, while accurately inferring intent, is probably the first finish line.