| HN Mirror

Correct, I think it is better to see it as multiple stages. Investigation stage might spin off tasks to read files, perform searches online, summarise the request. Then ‘main stage’ where it performs changes. Afterwards indeed the testing+fixing stage where it verifies the results and potentially performs a couple fixes. These plans are predictable and the models learn which steps are relevant first particular projects.

For context, relevant information from steps can be cherrypicked to next stage.

The math works differently because AI (mostly) ignores irrelevant results. So steps actually increase reliability overall.