by pshirshov 17 days ago
A very long post about a simple and very obvious idea with many different implementations.

The three main problems are: 1) API usage is deadly expensive; 2) Claude is about to make all automation very expensive; 3) all the flows where a model has the initiative are strictly biased towards unwarranted stops (checkpointing).

Also, I wouldn't call that "backpressure": there is no producer-consumer imbalance or anything similar. From what I can see, the author just proposes a structured feedback loop. This is really a discussion about organizational principles for systems that consist of multiple unreliable but very complex components, and this "backpressure" is just one aspect of it. Personally, I find the viable system model framework productive as both a mental model and a literal implementation guideline.

A lesser problem is that agent SDKs are bad and building a custom harness is hard.

5 comments

To stop agents from pausing for checkpointing, you can have a deterministic outer loop that re-runs until a stop condition is met.
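
A minimal sketch of such an outer loop, in Python. Everything here is hypothetical: `run_agent` stands in for whatever non-interactive agent invocation you use, `is_done` is your own stop condition (for example, a test suite passing), and the stub agent only simulates a model that pauses twice before finishing.

```python
from typing import Callable, Tuple


def run_until_done(
    run_agent: Callable[[int], str],
    is_done: Callable[[str], bool],
    max_runs: int = 10,
) -> Tuple[str, int]:
    """Re-invoke the agent until the stop condition holds or the budget is spent."""
    for attempt in range(1, max_runs + 1):
        output = run_agent(attempt)
        if is_done(output):
            return output, attempt
    # A hard cap keeps a stuck agent from looping (and billing) forever.
    raise RuntimeError(f"stop condition not met after {max_runs} runs")


def make_stub_agent(pauses_before_finishing: int) -> Callable[[int], str]:
    """Hypothetical stand-in for a real agent call: it 'checkpoints' a few times first."""
    def stub(attempt: int) -> str:
        if attempt <= pauses_before_finishing:
            return "Paused: should I continue?"
        return "ALL TESTS PASSED"
    return stub


result, runs = run_until_done(
    make_stub_agent(2),
    lambda out: "ALL TESTS PASSED" in out,
)
print(result, runs)
```

The loop itself is deterministic code: the model never decides whether to continue, only the stop condition and the run budget do.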

I think teams need to be able to write nested workflows that transition between code-led and agent-led, with either supporting human-in-the-loop checkpoints.

Been iterating on what this should look like at our startup (https://www.amika.dev/). Model labs are also improving capabilities here, such as Codex's `/goal` and Claude Code's dynamic workflows[1]

The points about API usage cost still stand, but model intelligence is getting cheaper every month! No need to use the frontier model for every part of the work.

[1]: https://code.claude.com/docs/en/workflows

It's hard to get that outer loop done, especially considering that Claude doesn't let you automate the harness anymore (it gets prohibitively expensive). Same for Gemini. The only option is Codex.

/goal is a dynamic workflow itself, from what I know. Dynamic workflows do not hold the initiative (and can't use any libraries or I/O).

Dynamic workflows do not prevent checkpointing.

I don't see the actual point of your startup; it's a cheap idea, like most LLM startups out there.

I don't see how models are getting cheaper - I clearly see the opposite trend.

Claude Code's dynamic workflows are AI-generated JavaScript, so unlike `/goal` they can in theory import libraries and perform I/O (not sure that they can currently).

On checkpointing: I explained myself poorly. You're right that using higher level workflows doesn't turn off checkpointing. One can simply make harnesses non-interactive, but that can make models lose coherence over long tasks (because they can't ask for feedback). A higher level coordinator (/goal, CC dynamic workflows) is designed to provide this feedback without human intervention.

On price: older models keep getting cheaper, and most tasks don't need frontier capability. (I'm ignoring the part about subscription subsidies right now, and just talking about API price for tokens)

On my startup Amika: we run programmable cloud computers for agents, plus the workflow systems to guide them. We let people run any agent (Codex, Claude, etc.) and prompt it from anywhere (Slack, web, CLI + SSH, API). It's like devboxes for humans + agents, with guardrails[1] to deterministically ensure things about the changes coding agents make (e.g., don't let an agent modify module boundaries, require that every DB query carry a multi-tenant org ID filter).

Maybe our website is bad at explaining it, in which case I appreciate any feedback!

[1]: https://docs.amika.dev/guides/code-annotations

> all the flows where a model has the initiative are strictly biased towards unwarranted stops

Can you elaborate on what you think causes such a bias? My experience is that Qwen3.6, Claude Sonnet 4.6 and Opus 4.6/4.7 will work as far as they can given direction and a way to test their work. My so-far limited experience with Opus 4.8 is that it does stop somewhat earlier for feedback, but in places where I am glad it is checking assumptions or where I agree with it identifying a change in scope (for example, where the following work deserves a separate commit or merge request). I would call those justified stops rather than unwarranted.

Ask Claude! It will quote its constitution aka soulfile. It says the constitution instructs it to perform regular checkpointing no matter what.

You can still do backpressure and auto review workflows all you want on your CC subscription. You just have to start the task using the Claude interactive CLI, and implement your multi-agent backpressure mechanisms using Claude Code's native Subagent system with skills that tell the model to trigger them (for any subagents you want to be Claude models).

The problem is not "backpressure", that's just one of the tools and there are different approaches with the same effect.

You can't express orchestration in terms of "backpressure" only, I think.

An Implement-Review-Repeat loop does not involve backpressure in the strict sense of the term.
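
To make the distinction concrete, here is a rough Python sketch of such a loop. The names (`Review`, `implement_review_repeat`, the two stubs) are made up for illustration and stand in for real implementer and reviewer agents. Note that nothing in it throttles a producer to match a slower consumer; the reviewer's verdict is simply fed into the next attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Review:
    approved: bool
    feedback: str = ""


def implement_review_repeat(
    implement: Callable[[Optional[str]], str],
    review: Callable[[str], Review],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Alternate implementer and reviewer until the reviewer approves."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        draft = implement(feedback)   # the first round gets no feedback
        verdict = review(draft)
        if verdict.approved:
            return draft, round_no
        feedback = verdict.feedback   # structured feedback, not rate control
    raise RuntimeError(f"no approved draft after {max_rounds} rounds")


def stub_implement(feedback: Optional[str]) -> str:
    """Hypothetical implementer: gets it wrong until it receives feedback."""
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


def stub_review(draft: str) -> Review:
    """Hypothetical reviewer: approves only the correct implementation."""
    if "a + b" in draft:
        return Review(approved=True)
    return Review(approved=False, feedback="add() subtracts; it should add")


draft, rounds = implement_review_repeat(stub_implement, stub_review)
print(draft, rounds)
```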

> 2) Claude is about to make all automation very expensive

Wait, what happened here??

They will charge for use of `-p` and the agent SDK at API rates starting 14 June. So, a 20x to 50x price increase.