by pshirshov
25 days ago
It's hard to get that outer loop done, especially since Claude no longer lets you automate the harness (it gets prohibitively expensive). Same for Gemini. The only option is Codex. As far as I know, /goal is itself a dynamic workflow. Dynamic workflows do not hold the initiative (and can't use any libraries or I/O). Dynamic workflows do not prevent checkpointing. I don't see the actual point of your startup; it's a cheap idea, like most LLM startups out there. I don't see how models are getting cheaper: I clearly see the opposite trend.
On checkpointing: I explained myself poorly. You're right that using higher-level workflows doesn't turn off checkpointing. One can simply make harnesses non-interactive, but then models can lose coherence over long tasks because they can't ask for feedback. A higher-level coordinator (/goal, CC dynamic workflows) is designed to provide this feedback without human intervention.
On price: older models keep getting cheaper, and most tasks don't need frontier capability. (I'm setting aside subscription subsidies for now and talking only about API token prices.)
On my startup Amika: we run programmable cloud computers for agents, plus the workflow systems to guide them. We let people run any agent (Codex, Claude, etc.) and prompt it from anywhere (Slack, web, CLI + SSH, API). It's like devboxes for humans + agents, with guardrails[1] that deterministically enforce properties of the changes coding agents make (e.g. don't let an agent modify module boundaries, or require that every DB query carries a multi-tenant org ID filter).
Maybe our website is bad at explaining it, in which case I appreciate any feedback!
[1]: https://docs.amika.dev/guides/code-annotations
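To make the "every DB query carries an org ID filter" example concrete, here is a rough sketch of what a deterministic check of that kind could look like. It is purely illustrative and not taken from the linked docs: the `org_id` column name, the `violations` helper, and the regex approach are all assumptions, and a real check would more likely parse the SQL properly than pattern-match it.

```python
import re

# Hypothetical rule: any SELECT/UPDATE/DELETE must filter on org_id in its WHERE clause.
# The column name "org_id" is an assumption made for this sketch.
GUARDED = re.compile(r"^\s*(select|update|delete)\b", re.IGNORECASE)
ORG_FILTER = re.compile(r"\bwhere\b.*\borg_id\s*=", re.IGNORECASE | re.DOTALL)


def violations(queries):
    """Return the guarded queries that lack an org_id filter."""
    return [q for q in queries if GUARDED.match(q) and not ORG_FILTER.search(q)]


if __name__ == "__main__":
    proposed = [
        "SELECT * FROM users WHERE org_id = %s AND id = %s",
        "SELECT * FROM users WHERE id = %s",
        "INSERT INTO audit_log (msg) VALUES (%s)",
    ]
    for q in violations(proposed):
        print("rejected:", q)
```

The point of a check like this is that it passes or fails the same way every time, regardless of what the agent was prompted with, so it can gate a change without a human or another model in the loop.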