Hacker News new | ask | show | jobs
by smeeth 359 days ago
I suspect its something to do with the following:

When humans get stuck solving problems they often go out to acquire new information so they can better address the barrier they encountered. This is hard to replicate in a training environment, I bet its hard to let an agent search google without contaminating your training sample.