by Ozzie_osman
2297 days ago
I disagree. In some cases, code history can be much more efficient. You really need a mix of both. There will be things that are much better captured as part of a revision/commit, especially if your commits are well-designed, grouped into logical chunks, and include meaningful messages (and maybe are linked to a project management tool). You will need information like "this code was added by X as part of work they were doing on Y, and they also made changes in other parts of the code as part of that". That context is really valuable. You can think of it like event sourcing, which captures a lot more information than traditional mutation of data, and as a pattern, is a lot more rock solid... except event sourcing for your data is (usually) much more difficult to implement in practice, and code revisions are already an almost completely solved problem.
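To make the event sourcing comparison concrete, here is a minimal sketch. The `Event` fields, `replay`, `history`, and the sample log are all hypothetical names made up for illustration, not any particular library's API. The point is only that an append-only log keeps the "who" and "why" that a plain mutated dict throws away.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded change; hypothetical fields chosen for illustration."""
    author: str   # who made the change
    reason: str   # why it was made (plays the role of a commit message)
    key: str      # what was changed
    value: int    # the new value


def replay(events):
    """Rebuild the current state by applying every event in order."""
    state = {}
    for event in events:
        state[event.key] = event.value
    return state


def history(events, key):
    """Return (author, reason, value) for every change made to one key."""
    return [(e.author, e.reason, e.value) for e in events if e.key == key]


# Assumed sample data: three changes, two of them to the same key.
log = [
    Event("X", "initial timeout for project Y", "timeout", 30),
    Event("Z", "raise timeout after incident", "timeout", 60),
    Event("X", "add retries for project Y", "retries", 3),
]

current = replay(log)                 # same result a mutated dict would hold
timeout_changes = history(log, "timeout")  # context a mutated dict has lost
```

With plain mutation you would only ever hold `current`; the log is what lets you answer "who changed the timeout, and why".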
|
- How do I quickly find the info that I want? How do I "query" the commit log? Oftentimes, we want a "view" of the history that tells us specific info. If I need to scan through half of the commits just to get a good understanding of the architecture of the code, then that's more wasteful than just having a design doc. If I'm troubleshooting a production bug, then the granularity the commit log offers becomes important enough to offset its slow "query" speed, so I'd want enough "why" information in the commit log and out of people's heads.
- People write to the commit log without a well-defined "schema". If you use something like tags, how do you handle changes to the tags ("schema evolution")?
This is my train of thought for why I lean towards "why" comments near the code or in a design doc over commit messages, which I allow to be a bit more sloppy.
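To illustrate the "query" and "schema" points above, here is a rough sketch, not a recommendation. It assumes commits are already loaded in memory and that the team writes `Key: value` tag lines in the last paragraph of each message. The `Commit` type, the `Why:`/`Reason:` tags, the `ALIASES` table, and the sample commits are all made up for the example.

```python
import re
from typing import NamedTuple


class Commit(NamedTuple):
    """Hypothetical in-memory commit record."""
    sha: str
    message: str


# A "Key: value" line, e.g. "Why: profile page was slow under load".
TRAILER = re.compile(r"^(?P<key>[A-Za-z-]+):\s*(?P<value>.+)$")

# Assumed alias table: an older tag name mapped to the current one.
# This is the "schema evolution" that somebody has to maintain by hand.
ALIASES = {"reason": "why"}


def trailers(message):
    """Parse 'Key: value' lines from the last paragraph of a message."""
    last_paragraph = message.strip().split("\n\n")[-1]
    found = {}
    for line in last_paragraph.splitlines():
        match = TRAILER.match(line.strip())
        if match:
            found[match.group("key").lower()] = match.group("value").strip()
    return found


def why_view(commits):
    """Build a 'view' of the log: (sha, why) for commits that explain why."""
    view = []
    for commit in commits:
        tags = {ALIASES.get(k, k): v for k, v in trailers(commit.message).items()}
        if "why" in tags:
            view.append((commit.sha, tags["why"]))
    return view


# Assumed sample log: one tagged commit, one sloppy one, one with an old tag.
commits = [
    Commit("a1", "Cache user lookups\n\nWhy: profile page was slow under load"),
    Commit("b2", "fix stuff"),
    Commit("c3", "Retry uploads\n\nReason: flaky network"),
]

view = why_view(commits)
```

Note that the sloppy `fix stuff` commit simply drops out of the view, and the renamed tag only shows up because of the alias table. That is the cost of writing without a schema.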
A higher-level thought: The attractiveness of the event sourcing analogy often comes from assuming that the commit log should be a strongly consistent source of truth. However, it's good to remember that a huge amount of info about the code is stored in the team members' heads. In particular, the code writer knows a huge amount that can't be easily documented. So an alternative analogy would be to think of each member as a VM with its own attached block storage. If a VM fails (the person gets sick) or leaves the cluster (they leave the job), then you lose all of the data in its block storage. So, the team wants to take on just enough overhead/admin work to transfer important data from individual team members to shared but slower storage (like the commit log, design doc, comments, etc.).