by _carbyau_ 1512 days ago
I see troubleshooting as being more like the OODA loop: https://en.wikipedia.org/wiki/OODA_loop

Observe - Orient - Decide - Act

But it might be more palatable as: Observe - Understand - Plan - Do

Often, the issue will be a symptom that you don't necessarily understand, and so the "Plan - Do" part of the loop will aim to get more information so as to better understand it. I.e. maybe an experiment through change, maybe simply gathering some logs.
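To make that concrete, here is a rough Python sketch of the loop where each "Plan - Do" step is a probe that gathers information rather than a fix. It is only an illustration under my own assumptions: `Hypothesis`, `LoopState`, `troubleshoot`, and the example causes are all hypothetical names, and picking the first untested hypothesis stands in for whatever planning you would really do.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Hypothesis:
    """A suspected cause plus a probe (log check, experiment) that tests it."""
    cause: str
    probe: Callable[[], bool]  # True if the evidence supports this cause


@dataclass
class LoopState:
    """What has been learned so far, so anyone can ask 'where are we in the loop?'"""
    observations: list[str] = field(default_factory=list)
    ruled_out: list[str] = field(default_factory=list)


def troubleshoot(
    symptom: str,
    hypotheses: list[Hypothesis],
    max_iterations: int = 10,
) -> tuple[Optional[str], LoopState]:
    state = LoopState(observations=[symptom])  # Observe: start from the symptom
    remaining = list(hypotheses)
    for _ in range(max_iterations):
        if not remaining:
            break  # Understand: nothing left to test, time to widen the framework
        candidate = remaining.pop(0)  # Plan: pick the next probe (assumed: first untested)
        confirmed = candidate.probe()  # Do: the action exists to gather information
        outcome = "confirmed" if confirmed else "ruled out"
        state.observations.append(f"{candidate.cause}: {outcome}")
        if confirmed:
            return candidate.cause, state
        state.ruled_out.append(candidate.cause)
    return None, state


# Hypothetical usage: the lambdas stand in for real log checks or experiments.
cause, state = troubleshoot(
    "requests timing out",
    [
        Hypothesis("disk full", lambda: False),
        Hypothesis("connection pool exhausted", lambda: True),
    ],
)
print(cause)
print(state.observations)
```

Returning `None` when every hypothesis is ruled out is deliberate in this sketch: it marks the point where you have to go back and observe more instead of acting again.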

But in a large organisation working on an urgent problem, it can be hard to keep everyone involved in the same stage of the loop, with the resulting chaos you'd expect.

2 comments

The OODA loop is explicitly as you said:

> Observe - Orient - Decide - Act

The "Orient" step is usually done inside of your existing mental framework (Boyd is pretty explicit about this, see the blue box in your linked Wikipedia article).

Now if you are a great troubleshooter operating in your field of expertise, your "orient" step (and mental framework) might be sufficient to find and solve the trouble. Think of an experienced engineer noticing some issues in a junior's code submission.

But OODA is not so good for solving problems where you need to extend your mental framework.

Your proposed:

> Observe - Understand - Plan - Do

could very well be a good approach, but it is not at all OODA, especially if the Understand phase involves a re-evaluation of your mental framework for the problem.

Hence all the "whys". You are questioning the situation and the way you think about the situation.

That blue box specifically has "new information" and "analysis and synthesis". If your previous Act step got you new information you can "orient" yourself with respect to it.

OODA is often thought of in a "do or die, moments count" context (given its origins...) where learning lots of stuff is often impossible, hence irrelevant. But with a longer time scope you can learn more. And given 10 minutes, a person familiar with the environment can learn a lot about an issue, without necessarily having solved it.

Applied to troubleshooting, so often the Decide and Act bits are about gathering more information! Hence my "logs" or "expt" examples. Maybe it is breaking out a debugger, or gathering stats on A vs B, but each Action gets one step closer.

The main reason I like to think of it this way is to prompt "stepping back". By being able to ask "Where am I at in the loop?", you give yourself permission to mentally step back from the coalface for a minute while still not giving up on the issue.

And every time I have seen successful leadership in resolving a problem, it is effectively someone acting kind of like a "flight controller", making sure the workers' OODA loops at least don't mess each other up and, better yet, are collaborative. I.e. keep them bubbling until someone says "I've got it!".

For me, OODA is valuable in the post-incident review. You can lay out every step and figure out how to get there faster.

Do you find OODA useful for general problem solving? I’ve only found it useful in dynamic environments, especially when there are other agents who act as well as react.

The theory is that you want to get inside the other agent's decision loop, so you can iterate faster than they can. For general problems, I have not found it very useful - curious how you use it for a typical engineering problem, if you wouldn’t mind sharing an example.

Like a lot of thought models, whether it is any use depends on how you apply it.

OODA came from adversarial practice. So often the intent is to be competitive vs the adversary. But if your adversary is inert - a problem to fix - then your time pressure to "get inside their loop" is gone. But of course you have other time pressures - presumably management or someone screaming for you to fix it. Or maybe it is a memory leak that you know will hit a critical point.

From my reply to a sibling comment:

OODA is often thought of in a "do or die, moments count" context (given its origins...) where learning lots of stuff is often impossible, hence irrelevant. But with a longer time scope you can learn more. And given 10 minutes, a person familiar with the environment can learn a lot about an issue, without necessarily having solved it.

Applied to troubleshooting, so often the Decide and Act bits are about gathering more information! Hence my "logs" or "expt" examples. Maybe it is breaking out a debugger, or gathering stats on A vs B, but each Action gets one step closer.

The main reason I like to think of it this way is to prompt "stepping back". By being able to ask "Where am I at in the loop?", you give yourself permission to mentally step back from the coalface for a minute while still not giving up on the issue.

And every time I have seen successful leadership in resolving a problem, it is effectively someone acting kind of like a "flight controller", making sure the workers' OODA loops at least don't mess each other up and, better yet, are collaborative. I.e. keep them bubbling until someone says "I've got it!".

For me, OODA is valuable in the post-incident review. You can lay out every step and figure out how to get there faster.