by personjerry 2297 days ago
I agree code history can be used as a form of documentation, but in cases like this looking through years of code to find the decisions/reasons leading to a particular design seems like inefficient communication. It seems like "real" documentation with a few sentences explaining directly would be more suitable.

I disagree. In some cases, code history can be much more efficient. You really need a mix of both.

There will be things that are much better captured as part of a revision/commit, especially if your commits are well-designed, grouped into logical chunks, and include messages themselves (and maybe are linked to a project management tool).

You will need information like "this code was added by X as part of work they were doing on Y, and they also made changes in other parts of the code as part of that". That context is really valuable.

You can think of it like event sourcing, which captures a lot more information than traditional mutation of data, and as a pattern, is a lot more rock solid... except event sourcing for your data is (usually) much more difficult to implement in practice, and code revisions are already an almost completely solved problem.
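
To make the analogy concrete, here is a minimal sketch of event sourcing next to plain mutation. The `Event` fields, the sample log, and the helper names are all made up for illustration; the point is only that the log can rebuild the current state and still answer "who changed this, and why", while the mutated dict cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded change, roughly what a commit is for code (hypothetical fields)."""
    author: str
    reason: str
    key: str
    value: int


def replay(events):
    """Rebuild the current state by applying every event in order."""
    state = {}
    for event in events:
        state[event.key] = event.value
    return state


def history_of(events, key):
    """Return every event that touched `key`, oldest first."""
    return [event for event in events if event.key == key]


# Hypothetical log of configuration changes.
LOG = [
    Event("alice", "initial default", "timeout_s", 30),
    Event("bob", "upstream is slow under load", "timeout_s", 90),
    Event("alice", "add retry limit", "retries", 3),
]

# Traditional mutation keeps only the end result...
mutated_state = {"timeout_s": 90, "retries": 3}

# ...which the log can reproduce, while also keeping the who and the why.
assert replay(LOG) == mutated_state
for event in history_of(LOG, "timeout_s"):
    print(f"{event.author}: {event.key}={event.value} ({event.reason})")
```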

Capturing most of your data in your commits/revisions seems to suffer from a lot of the unsolved event sourcing issues:

- How do I quickly find the info that I want? How do I "query" the commit log? Oftentimes, we want a "view" of the history that tells us specific info. If I need to scan through half of the commits just to get a good understanding of the architecture of the code, then that's more wasteful than just having a design doc. If I'm troubleshooting a production bug, then the granularity the commit log offers becomes important enough to offset its slow "query" speed, so I'd want enough of the "why" recorded in the commit log rather than only in people's heads.

- People write to the commit log without a well-defined "schema". If you use something like tags, how do you handle changes to the tags ("schema evolution")?
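
As a rough illustration of both bullets, here is a sketch that treats `Key: value` lines in the last paragraph of a commit message as a loose schema and builds a queryable "view" over them. The keys (`Why`, `Reason`, `Ticket`), the alias table standing in for schema evolution, and the sample commits are all assumptions for the example. It works on plain strings rather than calling git, so it shows the idea and is not a real tool.

```python
import re

# A trailer line looks like "Key: value" (loosely modeled on git-style trailers).
TRAILER = re.compile(r"^([A-Za-z][A-Za-z-]*):\s*(.+)$")

# Hypothetical schema evolution: older commits said "Reason", newer ones say "Why".
ALIASES = {"reason": "why"}


def parse_trailers(message):
    """Return the trailer key/value pairs from the last paragraph of a message."""
    paragraphs = message.strip().split("\n\n")
    if len(paragraphs) < 2:
        return {}  # subject line only, so no trailer paragraph
    trailers = {}
    for line in paragraphs[-1].splitlines():
        match = TRAILER.match(line.strip())
        if match:
            key = match.group(1).lower()
            trailers[ALIASES.get(key, key)] = match.group(2).strip()
    return trailers


def query(commits, key):
    """Build a "view": (sha, value) for every commit carrying the given trailer."""
    key = ALIASES.get(key.lower(), key.lower())
    view = []
    for sha, message in commits:
        trailers = parse_trailers(message)
        if key in trailers:
            view.append((sha, trailers[key]))
    return view


# Made-up commits; the shas and messages are illustrative only.
COMMITS = [
    ("a1b2c3", "Cache user lookups\n\nReason: profile page made N+1 queries"),
    ("d4e5f6", "Fix typo in README"),
    ("0718ab", "Raise worker timeout\n\nDetails here.\n\n"
               "Why: upstream API is slow under load\nTicket: PROJ-42"),
]

for sha, why in query(COMMITS, "why"):
    print(sha, why)
```

Even in this toy version the weak spots show: the commit with no trailer silently drops out of the view, and every renamed key needs an alias entry forever.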

This is my train of thought for why I lean towards "why" comments near the code or in a design doc over commit messages, which I allow to be a bit more sloppy.

A higher-level thought: The attractiveness of the event sourcing analogy often comes from assuming that the commit log should be a strongly consistent source of truth. However, it's good to remember that a huge amount of info about the code is stored in the team members' heads. In particular, the code writer knows a huge amount that can't be easily documented. So an alternative analogy would be to think of each member as a VM attached to block storage. If a VM fails (the person gets sick) or leaves the cluster (they leave the job), then you lose all of the data in block storage. So, the team wants to facilitate just enough overhead/admin work to transfer important data from individual team members to shared but slower storage (like the commit log, design doc, comments, etc.)

I recommend checking out Peter Naur's essay "Programming as Theory Building"[0], as it touches on the idea that a program is more than the code + documentation: it lives in its designers' and developers' heads, in their intents, visions, etc.

[0] http://pages.cs.wisc.edu/~remzi/Naur.pdf

I personally think of version control history as a kind of sedimentary layering of documentation that "updates" itself in the process of doing the work -- like the desk that looks messy, but because you keep picking up and using papers, the most important stuff ends up on top. "Real" documentation can be clearer, but it must be maintained manually and cleared out regularly. VCS kinda handles this with less process weight.

The right tactic is def a mix of both though, so I think I'm in agreement with you :)

It's because "your commit needs to link to something in the bug tracker" is pretty easy for code review to enforce, but I have never seen an org manage to indefinitely keep an accurate as-built design doc beyond the code itself. You can convince people to write new aspirational design docs for intended major changes, but after approval those never get updated to reflect what really got built, and lots of small bugfixes don't get one at all.

You might find some weird hack in the codebase where it isn't obvious at first glance what it does. This isn't something that people put in official documentation, but even finding a JIRA ticket that is linked to that specific commit can help tremendously.
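
For what it's worth, the "your commit needs to link to something in the bug tracker" rule is cheap to automate, which is part of why it survives. A minimal sketch, assuming JIRA-style keys like `PROJ-123` (the pattern, the function names, and the sample messages are my own illustration, not any particular tool's API):

```python
import re

# Assumed convention: JIRA-style keys such as "PROJ-123".
TICKET = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def ticket_refs(message):
    """Return every ticket key mentioned in a commit message."""
    return TICKET.findall(message)


def check_commit(message):
    """Return (ok, explanation) for a review or pre-merge check."""
    refs = ticket_refs(message)
    if refs:
        return True, "linked to " + ", ".join(refs)
    return False, "no ticket reference found"


# Made-up commit messages for illustration.
print(check_commit("Work around flaky TLS handshake (PROJ-481)"))
print(check_commit("quick fix"))
```

The same `ticket_refs` helper covers the other direction too: given the commit that introduced a weird hack, it pulls out the ticket key to look up. The pattern is deliberately naive and would also match strings like `UTF-8`, so a real check would probably restrict it to known project keys.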