| HN Mirror

We can cheat on discipline if we design our workflows with more careful thought about incentives.

I wound up in a role where I throw away 100% of the code that I write within a few months. My job is about discovering cases where people are operating under false assumptions (typically about how some code will or won't be surprising in context with some dataset), and inform them of the discrepancy. "proofs" would be too strong of a word, but I generate a lot of code that then generates evidence which I then use in an argument.

I do try to be disciplined about the code I rely on, but since I have no incentive to sneak through volumes of unreliable code before moving on to the next feature, it's easy to do. When I'm not diligent, the pain comes quickly, and I once again learn to be diligent. At the end of the day I end up looking at a dashboard I had an agent throw together and I decide if the argument I intend to make based on that dashboard is convincing.

Also, agent sycophancy isn't really a problem because the agents are only asked to collect and represent the data. They don't know what I'm hoping to see, so it's very uncommon that they end up generating something deceptive. Their incentives are also aligned.

I think we can structure much of our work this way (I just lucked into it) where there's no conflict of interest and therefore the need to be disciplined is not in opposition to anything else.