|
|
|
|
|
by vichoiglesias
115 days ago
|
|
The self-extending part is wild! agent mutating its own tools mid-run makes trajectory fidelity even more make-or-break. On git hooks: checkpointing state is easy, but if replay isn’t deterministic, bisect over checkpoints is unreliable... you can get different states on replay. Tool evolution is a brutal test case: if the predicate is “does this tool still handle edge X?”, it needs to stay violated once flipped, or binary search happily lies about the origin tick. Genuine question: when a self-built tool regresses, can you actually reconstruct the exact chain of reasoning/commitments that led to it? The artifact is simple to diff, the decision trail behind it is where it gets nasty. |
|