|
|
|
|
|
by khafra
59 days ago
|
|
"Did the vehicle just crash" has a short feedback loop, very amenable to RL. "Did this product strategy tank our earnings/reputation/compliance/etc" can have a much longer, harder to RL feedback loop. But maybe not that much longer; METR task length improvement is still straight lines on log graphs. |
|
Unless your CEO is Steve Jobs, it's hard to imagine it being much worse than your average pointy haired boss.