by zgm 4137 days ago
1. Start at the simplest subsystem, and work from there.

This is especially true when working with third-party, legacy, or undocumented code. In the absence of documentation, your only way forward is to read the source code[0]. Find the main() of the simplest/smallest module in the system, and begin tracing through the code. You can do this with a debugger, print statements, or the search functionality in your editor/IDE (Find Usages and Go to Declaration in IntelliJ are lifesavers).
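For the print-statement route, a small wrapper saves editing every function body by hand. This is only a rough sketch in Python: `traced`, `parse_port`, and this `main` are made-up names for illustration, not part of any real system.

```python
import functools


def traced(func):
    """Wrap func so every call prints its arguments and its return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"-> {func.__name__} args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        print(f"<- {func.__name__} returned {result!r}")
        return result
    return wrapper


@traced
def parse_port(text):
    # Hypothetical stand-in for a function in the module being explored.
    return int(text.strip())


@traced
def main():
    # Hypothetical entry point of the smallest module.
    return parse_port(" 8080 ")


if __name__ == "__main__":
    main()
```

Running it prints one line on entry and one on exit for each wrapped call, which gives a crude call trace without attaching a debugger.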

2. Don't be afraid to break things apart.

If the simplest module in the system is overwhelmingly complex, start commenting out parts of the code. Go until you have effectively reduced it to "Hello, World", if you have to. From there, you can gradually add features back.

3. Constantly test your assumptions.

Don't assume the code does what its comments say it does.

Don't assume that a config flag produces the behavior the documentation claims it does.

Do take inventory of your assumptions whenever the behavior of the system contradicts your current understanding of how it works.

You should be able to back up any claims about the system with empirical evidence, e.g. "when I change A to B, X happens; when I change A to C, Y happens."
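One way to collect that kind of evidence is to record what the system actually returns under each setting and diff the results. A minimal sketch in Python, assuming a hypothetical `legacy_round` function with a `mode` flag standing in for the real system (both names are invented for this example):

```python
import math


def legacy_round(value, mode="A"):
    """Hypothetical stand-in for an undocumented legacy function."""
    if mode == "A":
        return math.floor(value)
    if mode == "B":
        return math.ceil(value)
    return round(value)  # any other mode, e.g. "C"


def observe(func, inputs, **settings):
    """Record what func actually returns for each input under the given settings."""
    return {x: func(x, **settings) for x in inputs}


def diff(before, after):
    """Keep only the inputs whose observed output changed, as (before, after) pairs."""
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


inputs = [1.2, 2.5, 3.7]
baseline = observe(legacy_round, inputs, mode="A")
changed_b = observe(legacy_round, inputs, mode="B")
changed_c = observe(legacy_round, inputs, mode="C")

print("A -> B:", diff(baseline, changed_b))
print("A -> C:", diff(baseline, changed_c))
```

The diffs are the "when I change A to B, X happens" statements in concrete form, and the recorded dictionaries can be kept as regression checks while you keep poking at the system.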

[0] http://blog.codinghorror.com/learn-to-read-the-source-luke/

2 comments

IntelliJ does a very good job for most of my programming and I use it daily.

When learning a distributed system, treating the code as its documentation is not ideal. One could spend an enormous amount of time just to understand a simple business use case.

But then, the fact that it takes time doesn't make it untrue.

I would rather have a system that documents itself with every change.

Breaking things apart is all well and good, but to truly understand the system, one also needs to put the pieces back together one by one and understand their interactions. That way, if any one module in the pool fails, one knows exactly what would have caused it and how to be more resilient to failures.

What do you use in IntelliJ to self document? I have a hard time reviewing code effectively, so I'm curious about your strategies there.

I think there is some confusion here.

What I meant was that applications should be self-documenting; something like Swagger comes to mind.

It's completely different from person to person, but I use a whiteboard to draw up the different components and their interactions. Then I work my way down into subsystems when I need to. Having that overview is very beneficial.