The main difference is that this junior employee can't be held responsible if anything goes wrong. And the company that rented you this employee absolves itself of all responsibility too.
How would you hold the junior responsible if they bring down the system and it costs a million dollars? Will they actually recover all the money from the employee?
Who do you trust with the keys?
In any well-run organization you have multiple layers of controls. The same concept applies here, and I think the GP commenter captured it very well.
I think you'd trust someone with the keys when they've consistently shown that they can be trusted with less critical work. If you're having to constantly monitor someone's output, then promoting them is a liability.
The same applies to an AI model.
And since the same model would be deployed by many teams, unexpected behavior from that model, even for a small subset of those teams, means that it can't be promoted.
> In any well-run organization you have multiple layers of controls.
Everything depends on size.
A business with 8 employees might need 3 of them to be (literal) keyholders, and might be situated such that any of the keyholders has it in their power to destroy the business.
This is not ideal, obviously, but it is how the world has worked for a very long time, and it is difficult to understand how to make it better in some cases. Modern technology, such as cameras, might help, or might simply help to allocate blame after destruction has occurred.
In any case, this is the background of how people are used to working. We all deal with people who can absolutely destroy us, starting with the cop on the corner.
And we have mechanisms, both before-the-fact, like social coercion, and after-the-fact, like the legal system, to help ensure that this usually works.
LLMs exist in a world where most people are used to extending trust, but it isn't possible for LLMs to conform to the historical expectations that underpin that trust.
Yes. I think you can get agents to “conscious competence” with a lot of well-designed oversight, direction and control. It works, but it’s fragile, and it is nothing like the judgement needed to handle novel situations well.
Here is a fresh example from today of what a junior employee does when given unlimited agentic power: https://www.reddit.com/r/ClaudeAI/comments/1sv7fvc/im_a_nurs...