by PaulRobinson 4047 days ago
TL;DR key recommendations:

- Once a project is completed, the team must ensure that the “What” and “Why” of each software item are properly documented.

- In the cases of parallel development of inter-dependent software modules, set up a negotiation table to solve conflict between the development teams.

- Make sure that the development team is aware of the CMMI-ACQ or ISO12207 processes for negotiating with third parties.

- Make sure that testers are involved when negotiating with a third party for a potentially vulnerable software component. 

- Plan organization-wide process reviews to detect isolated processes and to promote information flows between processes.

- Plan special budget items to support long lasting corrections or corrections that are likely to benefit many modules.

- Projects with strict deadlines are risky, and should be carefully monitored to avoid last minute unplanned activities.

- Team members should maintain a careful balance between the flows of information within formal development processes and informal human interactions.

- Team members should make sure that knowledge is appropriately distributed amongst them. For example, pair programming is a practice which can promote knowledge sharing.

- Any intrusion into the team dynamics by outsiders should be done very carefully.


I think there are a lot of conflicting ideas at play here. As a coder, yes, I can write good code. However, everyone I know in the valley tells me "just shut up and launch, it doesn't have to be good. You should have launched yesterday, that's what they would have told you at YC". This advice has a lot of truth to it, because you need to get feedback, validate your project, and perhaps gain the first-to-market advantage in the highly competitive world of technology today, with investors and customers hounding you on both sides. However, this hasty, rushed culture will necessarily result in a lot of poorly written code being deployed (followed by a lot of rewrites), even by decent coders, because they simply aren't given the time and runway.
The situation is very different when creating software for internal use in large corporations, which I think was the case with this study. With a startup, launching early can get you the advantages you mentioned, and your code quality does not necessarily have to be high because you generally will not have many users at the start. Large internal software must be highly functional and performant from the first deployment because it will immediately be used by the whole company and will often affect revenue streams. This allows low quality to have a huge impact from day 1.
If you write code that solves a people problem and allows your project to move forward, no matter how stinky that code is, that code is "good code".

By nature, we programmers often judge our code on its technical merits. But a stinky piece of crap that I might write that shows me clearly what I need to build, or demonstrates the techniques that are required, or shows that something isn't going to work early, etc can have immense value.

Writing stinky code is a tool that every programmer can use. The trick is to know when you should use it and when you should not. If done properly, you can solve some intractable problems dealing with requirements gathering, design, or even political problems. When used improperly it can have exactly the opposite effect -- it pushes requirements gathering back, or defers important design decisions until it is too late, or creates conflict within the team.

I know some people in the industry who are pretty good at this, but I know of nobody who is a master at it. Not only do you need to be good at it personally, you need to be able to influence the team to coordinate their work in a way that is not destructive. This requires you to have impeccable taste, amazing technical ability and sublime communication skills. It is a skill that I wish more programmers would value and work towards.

If the code is stinky, it's bad code. It can still have business value, which is why somebody coined the term "technical debt".
I get really crabby about bad commit messages because on some projects, the commit history is the only 'Why' you ever get.

Scratch that, on MOST projects that's the case.

I'm a fan of whys. Even for yourself. I have code I have to maintain that I wrote 10 years ago.

But it can make for long commit messages. Why also includes what alternatives you considered, and why you didn't use those.

I also like to have a digital "worksheet" for each change I do, where all my thoughts and research goes. So if all else fails, I can reference that. But no-one else can, so I like to transfer as much knowledge as possible to the commit message. At the same time, some of these go on for 5 or more pages. They also tend to be very messy. I'm not sure if it's all appropriate for a commit message.

In my case, only key whats and whys go to the commit message. Most of the discussion is left in the issue tracker.

(Of course, the commit message refers to issue ID, and vice versa.)
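A minimal sketch of that split, assuming a hypothetical helper `build_commit_message` and a "Refs #ID" trailer as the issue reference (neither is from anyone's actual tooling): the key "what" goes in the summary line, the key "why" in a short wrapped body, and the long discussion stays in the tracker.

```python
import textwrap


def build_commit_message(summary: str, why: str, issue_id: int) -> str:
    """Compose a commit message: summary line, blank line, wrapped 'why', issue reference.

    The layout and the 'Refs #<id>' trailer are illustrative assumptions.
    """
    if "\n" in summary:
        raise ValueError("summary must be a single line")
    # Wrap the explanation at 72 columns, a common commit-body convention.
    body = textwrap.fill(why.strip(), width=72)
    return f"{summary.strip()}\n\n{body}\n\nRefs #{issue_id}\n"


if __name__ == "__main__":
    print(build_commit_message(
        "Switch to third-party CSV parser",
        "The standard parser chokes on files bigger than 2gb.",
        27,
    ))
```

The issue number here is only an example value; the point is that the message stays short while still pointing at where the rest of the reasoning lives.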

Is there any scenario under which a long commit message is detrimental, even if "messy"?
It's an issue when dealing with management or clients. They can see long commit messages as a sign of someone with too much time on their hands.

The whole point of a version control system is that it contains everything related to the code. On lots of web dev and game projects you also commit the finalized assets (the generated javascript from coffeescript, the compressed 3d textures, etc. etc.)

That's not bad commit messages, that's bad client management.
IMO most of your "why" ought to be code-comments, not commit-messages, because most of the time people say: "Why is this code this way" as opposed to "Why did this specific transition occur in the past".
I completely agree.

There are many times when I have seen a crappy piece of code that I wrote a while ago, and decided to "tidy it up". Then I test it, and remember there is a strange edge case that required me to write it the "crappy" way rather than the clean way. It always gets commented the second time.

There's nothing worse than a project full of bad commit messages.

There was one project I saw that had a two line commit message, and there were 29,000 changed lines across 44 files. Who does that?

I'm looking at a commit message from one of my clients today, which is just "Not, new bins, cant stop bins" and a check-in of several hundred dlls generated as build artifacts. Six days later another developer at the client checked-in a commit that removed all of the dlls, with the message "remove BIN folder".

Some other examples: "Fixes to make work", "Left Over", and their most common message " ".

They're nice guys to work with, but their VC habits are awful.

I have to admit, I've made a number of commits with "." as the message.

No excuse, beyond it usually being a minor change well documented in the code ( I always write comments ) and me being utterly buried under work... You know, start the day with 20 things to do, crack off 4 of them and have 23 things in queue at the end of your day.

The "." was sort of a placeholder for "Fuck This, I'm ready to quit." ( and I did eventually )

That's when I prefer to use `git commit --amend` or rebasing. I'll make "WIP doing things" commits, and then later squish a few together before making the code more public.
I could see the latter being something like "Apply coding standards", where somebody went in and fixed all the lines that went over the right margin or had wonky whitespace.

Of course, we can debate over and over whether mass-correcting existing files is actually helpful (one unofficial rule we have here is "only use the auto-formatter on code you've personally worked on or have taken over responsibility for", because it affects the blame history).

Also, I changed my name last year, and when I did that, I ran a mass find-and-replace to correct my credit in every Javadoc I'm credited on (I particularly detest my deadname, and I want it dead and buried), with the commit message being something like "Correct my credit to match my new name".

Guilty as charged. Or maybe 'nolo contendere'. I just dropped 5kloc, 6 weeks work, into a repo with the message "Initial commit"

I'm currently working on the changes that will actually document what is otherwise a wall of code.

The road to hell is paved with good intentions.

I can one-up you there: one of my "initial commit" messages was a dedication to my cat, who passed away a couple of days before I checked the code in. Better yet, we released that project as open-source, so that commit message is still floating around Assembla for all to see.
I could see that. Initial imports and all that.

However, my example was a project that was around for about 3 years at that point, and it was basically just labelled: "Upgrade to version 2.0".

Face → Palm

I've made that commit before. It was actually an initial import of a from-scratch rewrite that happened to have utility libraries and such in common. Delete everything, drop in the new files you've been working on elsewhere, make that a commit. Not everything changes, because some things are the same by coincidence. Otherwise, it's just a discontinuity.
Eh, uninformative commit messages aren't a sin, they're a trade-off. Right now, most of my commit messages are one word. Why? Because at the moment nobody cares about my commit messages because nobody cares about my project because it doesn't yet do anything useful. Getting to the point where it does something useful has higher priority than writing long messages nobody will read. Once the commit messages have an audience, then I'll put work into writing better ones.
It's a self-fulfilling prophecy though. If you want to be important you have to act important.

Yes, nobody cares about your commit messages until there's a bug or a major refactor. By then, if the commit messages suck, it's too late to do anything about it.

"Programs are meant to be read by humans and only incidentally for computers to execute"

Is never more true than when you're trying to fix bugs or make improvements. A project you can't improve is a dead project.

This just reminded me of whatthecommit [0]. IIRC they get these from actual commit messages; if they do, there is no hope. :-)

[0] http://whatthecommit.com

I once saw a coworker commit a reshuffling of a project's directory structure with the message "YOLO". We made fun of him for a while for that.

(edit: He made this change after a lot of discussion with a bunch of people, and he also sent out a mass email describing the changes to the structure, so he wasn't totally being irresponsible there.)

https://github.com/janraasch/javascript-commitment/blob/115d...

I'd assume the keys are the commit hashes? Now how to search github for them...

this line made my day: ' "7142cd872a703392c1b094a18a1e229e": "LAST time, XNAMEX, /dev/urandom IS NOT a variable name generator...",'
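On the "are the keys commit hashes?" question, one quick check is length: a full Git SHA-1 commit ID is 40 hex characters, while the quoted key is 32, which is the length of an MD5 digest. A rough sketch (the function name is made up, and what the keys actually are is not stated anywhere in this thread):

```python
import re

FULL_SHA1_HEX = re.compile(r"[0-9a-f]{40}")   # length of a full Git SHA-1 object ID
MD5_LENGTH_HEX = re.compile(r"[0-9a-f]{32}")  # length of an MD5 digest


def describe_hex_key(key: str) -> str:
    """Classify a hex string by length only; this says nothing about its real origin."""
    key = key.lower()
    if FULL_SHA1_HEX.fullmatch(key):
        return "40 hex chars: could be a full Git SHA-1 commit ID"
    if MD5_LENGTH_HEX.fullmatch(key):
        return "32 hex chars: MD5-digest length, too short for a full SHA-1"
    return "neither 32 nor 40 hex chars"


if __name__ == "__main__":
    print(describe_hex_key("7142cd872a703392c1b094a18a1e229e"))
```

So the key quoted above is at least not a full commit ID, which would make searching GitHub for it as a commit unlikely to work.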
I'll take a sound codebase and zero commit messages over a poorly structured codebase any day.
This choice makes no sense but.... hmm, really? I think I'd rather have good commit messages that explain the "whys" that went into the code.

Code typically isn't rocket science. It's the human knowledge that goes into it that's irreplaceable.

Example 1: OK, you're using a third-party CSV parser instead of the one built into the standard library. Why? If your code is crap but well-documented, I can read what you were thinking: "Using non-standard CSV parser because the standard one chokes on files bigger than 2gb" At that point I can refactor your code, or perhaps see that this issue has been fixed in a newer version of the standard library. Or maybe I realize that you confused gigabits and gigabytes and you made a bad choice in the first place, and I realize I can safely remove this dependency. But if your code is tight but undocumented... I would have no idea why this third-party library is being used unless I do some painful trial-and-error that still might not definitively answer the why.

Example 2: You inherit Mary's code. It calculates commissions for our salespeople. The code is sloppy and convoluted, because the sales guys change the commission formulas every month... and these changes have been happening for over ten years, often on very short notice, often contradicting basic assumptions made when the software was originally architected. But Mary documented every change. Which is good, because the fucking sales guys sure don't. Her code is literally the only coherent record in the entire company of the commission process. Remove her comments and commit messages, and none of the code would make sense, even if it was tightened up into a sounder codebase of seven modules with 300 LoC each instead of ten modules with 500 LoC each.

So yeah. Totally fictional choice but I'll take documentation every time. Code is just code, I can fix it.

(Both those examples are fictional, but I've been coding professionally for nearly twenty years and I've seen variations of them countless times...)

In both of your examples, I believe comments in the code are the real winning strategy, rather than the individual commit-messages.

99% of the time what you want is to understand the current code--or at least code at a specific past point in time--as opposed to every transition that occurred.

For the CSV parser, a comment ("/* We use this for >2gb support */") or a test case ( testOverTwoGigsParseable() ) would be a lot more useful than any level of discipline over commit-messages.
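To make the comment-plus-test idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the standard `csv` module stands in for whatever parser the project really uses, and `synthetic_csv_lines`, `count_rows` and the snake_case test name are invented. The input is generated lazily so nothing close to 2 GB is held in memory, and the size is a parameter so a quick run can use a small value.

```python
import csv
from typing import Iterator

TWO_GIB = 2 * 1024**3


def synthetic_csv_lines(min_bytes: int) -> Iterator[str]:
    """Lazily yield CSV lines until at least min_bytes of text have been produced."""
    produced = 0
    row_number = 0
    while produced < min_bytes:
        line = f"{row_number},value-{row_number}\n"
        produced += len(line)
        row_number += 1
        yield line


def count_rows(lines: Iterator[str]) -> int:
    # WHY: rows are streamed one at a time instead of being loaded at once,
    # because inputs can be bigger than 2 GB (the constraint from the example above).
    return sum(1 for _ in csv.reader(lines))


def test_over_two_gigs_parseable(min_bytes: int = TWO_GIB + 1) -> None:
    """Pin the 'why' as a test; the default size is slow, so pass a smaller one for quick runs."""
    expected = sum(1 for _ in synthetic_csv_lines(min_bytes))
    assert count_rows(synthetic_csv_lines(min_bytes)) == expected


if __name__ == "__main__":
    test_over_two_gigs_parseable(min_bytes=1_000_000)  # scaled-down smoke run
```

The comment answers "why is this code this way" for the next reader, and the test fails loudly if someone swaps in a parser that cannot handle the size.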

For Mary's commission-calculator, it sounds like nobody has access to good "whys" anyway, because they boil down to "salesguy X insisted on it". Instead, the commits are functioning as an auditing/blame tool.

That statement makes no sense. It's not like anyone ever has to make a choice between making sound decisions while coding and making sane commits and commit messages.
Well, yes, but it's true :-D
"updates"
fixed indentation updated to pass new code linter
That's still a why. An irritating one, sure, but it at least tells me to look at the previous commit to find the real 'why'.
I had a local git repo with a month's worth of commits. I hadn't pushed any of them so they were all _just_ local (though the head was constantly being uploaded to an online store).

I got a new computer and when I was getting rid of my old one, it didn't occur to me to push all the commits or save the git repo. So, my next commit consisted of all the changes for that entire month.

It sounds like you forgot to copy the .git folder that had all of the commit information.

If you make a commit, it would copy over with that, because it's written in that folder.

I'd wager that you simply copied the main files over and tried to re-commit.
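A small sketch of the difference, using an empty `.git` directory as a stand-in for a real repository (the helper names are hypothetical): copying the whole project folder carries the history along, while copying only the working files does not.

```python
import shutil
from pathlib import Path


def has_git_metadata(project_dir: Path) -> bool:
    """True if the folder has a .git entry (a directory, or a file for worktrees/submodules)."""
    return (project_dir / ".git").exists()


def copy_project(src: Path, dst: Path, include_history: bool) -> None:
    """Copy a project folder, optionally leaving the .git folder behind."""
    ignore = None if include_history else shutil.ignore_patterns(".git")
    shutil.copytree(src, dst, ignore=ignore)
```

Running `has_git_metadata` on the destination before wiping the old machine would have flagged the missing history in the scenario described above.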

but wouldn't those be represented as individual commits (just happened to be pushed at the same time)?
Sounds like he lost the local repo, and only the actual final state of the files was backed up. Thus, a single new commit representing everything at once.
yes, that's right. When I got rid of my computer, I didn't save the folder containing the repo; all the commits were lost.

I was able to retrieve the full final content from the place where it was being uploaded to, but that didn't have any git-related files.

What was the commit message?

"Issue #27" is terrible. Something like (in C#/VS) "Execute CodeMaid against solution" would be perfectly fine.

Depends on how you have things set up. I once worked at a company with SVN/Trac integration. If you put "Refs #27" in a commit message, it would actually add a link to the commit as a comment on the ticket. If you put "Fixes #27", it would do that and close the ticket. Our system was also set up so you had to ref or fix a valid ticket in order to commit.

I miss having a system like that...
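For anyone wanting to approximate that setup, the check itself is small. This is only a sketch under assumptions: the regex, the function names, and the `valid_tickets` set are made up here, and it is not how the SVN/Trac integration is actually implemented.

```python
import re
from typing import List, Set, Tuple

# Assumed syntax, taken from the description above: "Refs #27" or "Fixes #27".
TICKET_PATTERN = re.compile(r"\b(Refs|Fixes)\s+#(\d+)\b")


def ticket_actions(message: str) -> List[Tuple[str, int]]:
    """Return (action, ticket number) pairs found in a commit message."""
    return [(m.group(1), int(m.group(2))) for m in TICKET_PATTERN.finditer(message)]


def check_commit_message(message: str, valid_tickets: Set[int]) -> List[Tuple[str, int]]:
    """Reject messages that reference no ticket, or a ticket that does not exist."""
    actions = ticket_actions(message)
    if not actions:
        raise ValueError("commit message must reference a ticket, e.g. 'Refs #27'")
    unknown = [number for _, number in actions if number not in valid_tickets]
    if unknown:
        raise ValueError(f"unknown ticket(s): {unknown}")
    return actions
```

A pre-commit hook could call `check_commit_message` and refuse the commit on `ValueError`; a "Fixes" action would then be the cue to close the ticket and a "Refs" action to add a comment.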

That doesn't sound Agile.
Really? Let's review:

- Once a project is completed, the team must ensure that the “What” and “Why” of each software item are properly documented. That's essentially a retrospective.

- In the cases of parallel development of inter-dependent software modules, set up a negotiation table to solve conflict between the development teams. Role of the scrum master to remove impediments.

- Make sure that the development team is aware of the CMMI-ACQ or ISO12207 processes for negotiating with third parties. Previous bullet point.

- Make sure that testers are involved when negotiating with a third party for a potentially vulnerable software component.  Testers are stakeholders in this, they should be there.

- Plan organization-wide process reviews to detect isolated processes and to promote information flows between processes. Removal of impediments.

- Plan special budget items to support long lasting corrections or corrections that are likely to benefit many modules. Got me on this one.

- Projects with strict deadlines are risky, and should be carefully monitored to avoid last minute unplanned activities.

- Team members should maintain a careful balance between the flows of information within formal development processes and informal human interactions.

- Team members should make sure that knowledge is appropriately distributed amongst them. For example, pair programming is a practice which can promote knowledge sharing.

- Any intrusion into the team dynamics by outsiders should be done very carefully.

These last four are key elements of Scrum: managing the burndown, being flexible, and ensuring chickens cannot interrupt the pigs.

Agile does not mean there is no process or formal rules. It's not a free-for-all; agile is about the ability to quickly respond to change.

Standardized processes defined by committee like CMMI-ACQ and ISO12207 are pretty much the exact opposite of Agile, so, yeah.
> Standardized processes defined by committee like CMMI-ACQ and ISO12207 are pretty much the exact opposite of Agile, so, yeah.

Meh. All the Agile Manifesto really says about process is:

"Individuals and interactions over processes and tools"

and

"Responding to change over following a plan"

Nothing says you can't (or shouldn't) use a pre-defined process, or have a plan. If anything, the core of what "Agile" is, is about being flexible and responsive to change. As long as your process allows for that, it can be implemented as an Agile process even if it was created by a committee.

That's not to say that most firms using things like CMMI aren't doing it in a way that is far removed from Agile principles, but I blame that on the implementors more than the process. YMMV.

I don't have enough context to determine your tone. Are you saying that the above suggestions are not Agile, and so are not compatible with Real Software Development (or at least default to that state)? Or that the above list brings into question the Agile methodology? Or something else entirely?

I'm very interested in the processes of developing good software so I hope there is some good discussion around this paper.