> long contexts are still expensive and can also introduce additional noise (if there is a lot of irrelevant info)

I think spec-driven generation is the antithesis of chat-style coding for this reason. With tools like Claude Code, you are the one tracking what was already built, what interfaces exist, and why something was generated a certain way.

I built Ossature[1] around the opposite model. You write specs describing behavior, it audits them for gaps and contradictions before any code is written, then produces a TOML build plan where each task declares exactly which spec sections and upstream files it needs. The LLM never sees more than that, and there is no accumulated conversation history to drift from. Every prompt and response is saved to disk, so traceability is built in rather than something you reconstruct by scrolling back through a chat.

I used it over the last couple of days to build a CHIP-8 emulator entirely from specs[2]. I have some more example projects on GitHub[3].

1: https://github.com/ossature/ossature
2: https://github.com/beshrkayali/chomp8
3: https://github.com/ossature/ossature-examples
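To make the "each task declares exactly what it needs" idea concrete, here is a rough Python sketch. This is not Ossature's actual plan format or API; the `Task` fields, `build_prompt`, and the spec and file contents are all hypothetical names made up for illustration. It only shows the general shape: the prompt is assembled from the declared inputs and nothing else.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One build step. All field names are hypothetical, not Ossature's schema."""
    name: str
    spec_sections: list = field(default_factory=list)
    upstream_files: list = field(default_factory=list)


def build_prompt(task, spec, files):
    """Assemble a prompt from only the inputs the task declares."""
    parts = [f"# Task: {task.name}"]
    for section in task.spec_sections:
        parts.append(f"## Spec: {section}\n{spec[section]}")
    for path in task.upstream_files:
        parts.append(f"## File: {path}\n{files[path]}")
    return "\n\n".join(parts)


# Illustrative spec sections and files (made up for this example).
spec = {
    "memory": "4 KiB of RAM, programs load at 0x200.",
    "display": "64x32 monochrome framebuffer.",
    "timers": "Delay and sound timers tick at 60 Hz.",
}
files = {
    "memory.py": "class Memory: ...",
    "display.py": "class Display: ...",
}

task = Task(name="cpu", spec_sections=["memory", "timers"], upstream_files=["memory.py"])
prompt = build_prompt(task, spec, files)
print(prompt)
```

The display section and `display.py` never reach the prompt because the task did not declare them, which is the whole point: no chat history, no irrelevant context.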
Now the coding agent starts fresh each time, and it's up to you to understand what you asked it and to provide the feedback loop.
Instead of chat -> code, I think chat -> spec and then spec -> code is much more likely to be the future.
The spec -> code phase should be independent of any human. If the spec is unclear, ask the human to clarify the spec, then use the spec to generate the code.
What happens today is that something is unclear and there is a loop where the agent starts to uncover some broader understanding, but then that understanding is lost in the next chat. And the human also doesn't learn why their request was unclear. "Memories" and agents files are all duct tape over this problem.
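A rough Python sketch of the loop I mean, with the clarification written back into the spec instead of living in a chat. Everything here is hypothetical (`audit`, `clarify`, `generate`, the "TODO" convention, the stub `llm` callable); a real audit would be far smarter than a string check. The point is only the ordering: resolve gaps in the spec first, then generate with no human involved.

```python
def audit(spec):
    """Return names of sections that are empty or still marked TODO."""
    return [name for name, text in spec.items() if not text.strip() or "TODO" in text]


def clarify(spec, ask):
    """Resolve each open section by writing the human's answer into the spec."""
    resolved = dict(spec)
    for name in audit(spec):
        resolved[name] = ask(name)
    return resolved


def generate(spec, llm):
    """Refuse to generate until the audit is clean; no human in this step."""
    gaps = audit(spec)
    if gaps:
        raise ValueError(f"spec has unresolved sections: {gaps}")
    return llm("\n\n".join(f"## {name}\n{text}" for name, text in spec.items()))


# Illustrative draft spec with one unresolved section.
draft = {"input": "Read keys 0-F.", "sound": "TODO: what happens on beep?"}
# Stand-in for asking the human; a real tool would prompt interactively.
answers = {"sound": "Play a tone while the sound timer is nonzero."}

final = clarify(draft, ask=answers.__getitem__)
# Stub LLM so the sketch runs without any external service.
code = generate(final, llm=lambda prompt: f"<generated from {len(prompt)} chars>")
print(code)
```

Because the answer ends up in `final`, the next run starts from a spec that already contains it, and the human can see exactly which section was unclear.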