by bfeynman
56 days ago
How is it not analogous to data leakage? The claim is that the system works autonomously, or at minimum could, but it is effectively getting signal through human-in-the-loop feedback. That's leakage into test-time evaluation.
Also, the coding analogy is misapplied: there, the LLM is using its own signals autonomously in the environment.
Using a Kalman filter on an ICBM with its own sensors is analogous to the coding agent, and it is autonomous. What's presented here is a system where a human is course correcting based on signals/sensor data, and that is not autonomous.
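For concreteness, here is a minimal sketch of the kind of closed loop meant by "a Kalman filter with its own sensors": a scalar filter that corrects its own estimate from its own readings, with no human in the loop. Everything in it is an assumption for illustration (the function name `kalman_1d`, the constant-state model, the noise variances, and the sample readings); it is a toy, not real guidance code.

```python
def kalman_1d(measurements, process_var=1e-3, sensor_var=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (hypothetical) constant-state model.

    x is the state estimate, p is the variance of that estimate.
    All default values are illustrative assumptions.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, so only uncertainty grows.
        p = p + process_var
        # Update: weigh the new sensor reading against the current estimate.
        k = p / (p + sensor_var)   # Kalman gain, between 0 and 1
        x = x + k * (z - x)        # correct the estimate using the residual
        p = (1.0 - k) * p          # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


# Hypothetical noisy sensor readings of a quantity near 1.0.
readings = [1.2, 0.9, 1.1, 1.0, 0.95, 1.05]
estimates = kalman_1d(readings)
print(estimates[-1])
```

The only input to the loop is the measurement stream `z`; every correction comes from the filter's own update step.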
|
The human isn't course correcting; the agent is, based on the feedback data. The human is just passing that feedback data to the agent in cases where the agent can't access it itself, because the tooling for that isn't in place yet.
data leakage would be the following:
but in this case: