

by sermakarevich 51 days ago

My strong opinion is that every software engineer should get into agentic engineering these days: understand the limitations of AI, learn how to shift them, and squeeze the maximum possible out of it. The AI boom is attracting hundreds of billions in investment, close to a trillion, including yesterday's announcement from China about investing $300B in a nationwide grid of data centers. Models are becoming smarter, and tasks that yesterday's models could not handle are quickly coming within reach.

I look at agentic engineering from the complexity perspective. Give an agent a simple task, like inserting an assertion, and you can be 100% sure it will do it correctly. Give it a complex task made of many subtasks in a complex codebase and it will fail. There is a boundary of complexity that a single call to an agent can handle. We can push this boundary by decomposing the task into many small ones and giving each a sufficient description of what has to be done.

I found that the Spec Driven Development (SDD) approach works nicely for me, and I've been using it since Feb 2026 for all my mid-size and larger projects. I started with the GSD plugin, but it soon became too heavy, so I implemented my own lightweight SDD-based workflow for Claude. A friend of mine ported it to gemini-cli, and that version was added to Google's approved third-party frameworks for internal use. The idea is to decompose feature implementation into multiple steps and task implementation into multiple subtasks, and to be able to clear the context after every task/implementation.

Repo: https://github.com/sermakarevich/sddw

Slides: https://docs.google.com/presentation/d/1SjKXF7hkoqyiN9-3tBGY...
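
To make the "clear the context after every task" idea concrete, here is a minimal sketch in Python. It is not how sddw is implemented; `Task`, `Subtask`, `build_prompt`, `run_feature`, and the `run_agent` callable are all hypothetical names made up for illustration. The only point is that each subtask gets a prompt built from the spec and its own description, with nothing carried over from earlier subtasks.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Subtask:
    title: str
    description: str


@dataclass
class Task:
    title: str
    subtasks: list[Subtask] = field(default_factory=list)


def build_prompt(spec: str, task: Task, subtask: Subtask) -> str:
    """Build a self-contained prompt for one subtask.

    Only the spec and this subtask's own description go in, so the
    agent starts from a clean, focused context every time.
    """
    return (
        f"Spec:\n{spec}\n\n"
        f"Task: {task.title}\n"
        f"Subtask: {subtask.title}\n"
        f"Details: {subtask.description}\n"
    )


def run_feature(
    spec: str,
    tasks: list[Task],
    run_agent: Callable[[str], str],
) -> list[str]:
    """Run every subtask as a separate agent call with a fresh prompt.

    `run_agent` is a hypothetical stand-in for whatever actually calls
    the model (a CLI, an SDK, ...). No conversation history is passed
    between calls.
    """
    results = []
    for task in tasks:
        for subtask in task.subtasks:
            results.append(run_agent(build_prompt(spec, task, subtask)))
    return results
```

In practice `run_agent` would wrap a real agent invocation; passing a stub function is enough to check that each prompt stays isolated.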

When SDD was not enough, I started playing with scaling a single AI worker to multiple workers, and eventually arrived at an agent swarm. Built on top of a centralized Beads database, claude -p headless execution, a UI, a custom ask_user MCP, and Telegram integrations, fleet (the app name) lets me add many tasks in advance, control the number of workers executing them, and use any kind of coder/model. It works nicely with the SDDW implementation phase. It shines when you keep creating tasks, define dependencies between them, and give clear descriptions. For personal projects I can queue up 70 tasks for an overnight run, set the number of workers to 1 to avoid hitting usage limits, and let it roll.

Repo: https://github.com/sermakarevich/fleet

Slides: https://docs.google.com/presentation/d/1O_pXyKdtpRG2ORD1xw7s...
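
The queue-with-dependencies idea behind fleet can be sketched roughly like this. This is an assumption-heavy toy, not fleet's code: it keeps tasks in memory instead of Beads, and `QueuedTask`, `headless_command`, and `run_queue` are hypothetical names. The only detail taken from the setup above is that a worker would run a task through `claude -p`; the sketch just builds that command and leaves actual execution to a caller-supplied `execute` function.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class QueuedTask:
    task_id: str
    description: str
    depends_on: set[str] = field(default_factory=set)


def headless_command(task: QueuedTask) -> list[str]:
    # Assumption: a worker runs the task description as a headless prompt.
    # The argv is only built here; hand it to subprocess.run to execute it.
    return ["claude", "-p", task.description]


def run_queue(
    tasks: list[QueuedTask],
    execute: Callable[[QueuedTask], None],
    max_workers: int = 1,
) -> list[str]:
    """Run tasks with at most `max_workers` in flight, honoring dependencies.

    Returns task ids in completion order. An exception raised by
    `execute` propagates and stops the run.
    """
    pending = {t.task_id: t for t in tasks}
    done: set[str] = set()
    order: list[str] = []
    running = {}  # future -> task_id

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # A task is ready once all of its dependencies have completed.
            ready = [t for t in pending.values() if t.depends_on <= done]
            for task in ready:
                if len(running) >= max_workers:
                    break
                running[pool.submit(execute, task)] = task.task_id
                del pending[task.task_id]

            if not running:
                # Nothing in flight and nothing ready: cycle or missing id.
                raise ValueError(f"unsatisfiable dependencies: {sorted(pending)}")

            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                future.result()  # re-raise worker errors
                task_id = running.pop(future)
                done.add(task_id)
                order.append(task_id)
    return order
```

With `max_workers=1` this degenerates into a sequential overnight run in dependency order, which is the usage-limit-friendly mode mentioned above; raising the number lets independent tasks run in parallel.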

Since Fable 5 appeared, I've been changing the way I work with fleet. Instead of adding tasks/descriptions/dependencies to fleet myself, I talk to Fable: specify the goal, ensure understanding, and let Fable 5 add tasks to fleet. Fable is expensive, but in this setup it doesn't code — it just investigates, designs, decomposes, and creates tasks. Workers use the cheaper Sonnet 4.6 model.

Reliability comes from decomposing features into many smaller, simpler subtasks, splitting each task's implementation into multiple steps, writing better descriptions, and keeping the context clean and focused.