Hacker News new | ask | show | jobs
by moxiemk1 5348 days ago
The the code the article links to for zipmap.c (https://github.com/antirez/redis/blob/unstable/src/zipmap.c) is rather literate.

I haven't dug extremely deeply into the sources for many F/OSS projects; the code I'm interested in reading has been inevitably opaque (at least to my inexperience). This particular source file (and maybe the rest of Redis?) is really good. I think I'll be taking many more looks at Redis (code- and usage-wise) in the future.

1 comments

Back when we were getting started with Redis, the readability / concise nature of the project was one of the things that most excited me about it (here's what it looked like around then: https://github.com/antirez/redis/tree/0b420168b485d0a9c4b66d...)
I'm curious, what in particular makes you say that code is "readable"?

Coding style varies dramatically from person to person, and I don't mean this as a criticism of antirez, but any code which doesn't have at minimum a one-line comment before each function explaining its purpose immediately fails the "readability" test for me. Obviously this isn't a problem for you, so I'm curious to hear what your tastes are.

Hi cperciva, I actually think that comments are not a so important part of code quality. I tend to add comments where my code risks to be not clear by itself, and the zipmap.c code is indeed more commented than my average code since it is all about encoding stuff in a binary blob string, playing with pointers, and so forth. So actually too much comments may even be a sign that something is bad about the code.

IMHO good code should be readable since the purpose of different files, functions, statements, data structures, should be obvious (at different levels of course), and every time it is not obvious there should be a comment helping the reader to understand what is going on.

My idea is that programmers with time develop a feeling about when a comment is needed. For instance a comment is needed all the times you are writing something that avoids a specific problem but you'll likely not remember why it was needed in a few weeks. Other times comments are useful since the flow of the function is complex and there is no easy way to refactor it into many pieces, so comments help to organize the function in smaller conceptual parts, and so forth.

There is no absolute rule, but the reality is that IMHO the test is simple to do for external people: good code is easy to understand and modify without being an expert of that code base.

This topic is a good idea for a blog post, since I thought a lot about this issues lately. For instance if you want a place in Redis where code should be improved is in the handling of blocking operations: there are a few things in that code that are absolutely non obvious even adding comments, and you either are a lot "into it" or you'll not have an easy time understanding it. I'm planning a refactoring of that piece of code.

I used to consider most comments a sign of poorly factored code. (If you need comments, you've probably written your code in an obtuse way).

I started a new project a few weeks ago and went to the other extreme to see what it was like. I think I have more comments than code:

https://github.com/josephg/node-browserchannel/blob/master/l...

Rendered with docco: http://josephg.github.com/node-browserchannel/docs/server.ht...

The comments in this file also describe the spec of what I'm implementing and give running commentary on the implemention. It took me awhile to get used to writing like this, but I really like the result. When I want to add a function or something I talk about what I want to achieve in a comment block before writing the code.

Somehow, the whole feeling of programming is different. By making the documentation the highest priority, I feel like I'm writing rather than coding. (I hate writing generally, but I really enjoyed it here.)

I'm not sure if it helps or hinders readability. Obviously there's way more information in there about my intensions. You can tell if surprising behaviour in the code is intended or if its a bug. But the code has become really long and you have to scroll heaps to be able to move through the file. I feel like my monitor has shrunk. Maybe I just need to split the code out into more files.

I'm not sure if I want to program like this all the time; but I'm writing a lot more inline documentation now. When you get stuck trying to decide how to implement something, talking about the design choices in a block comment has the same effect as explaining the problem to a coworker. Only this way you don't bother your coworkers and you have documentation on the choice when you're done.

Its also a fun experiment because of how weird it feels - I highly recommend giving it a go sometime.

I tend to add comments where my code risks to be not clear by itself

I used to take that position, but I've started adding more comments in order to avoid "mental stack overflows". Suppose I'm reading function A and trying to understand it, then I find a call to function B which doesn't have any comment explaining what it does; I then go look at function B, and find it has a call to function C which is equally lacking in commentary; and by the time I've read the code in function C to understand what it does and gone back to function B to understand what it's doing I've completely lost track of what I was looking at in function A.

Of course, if you already know what most of the code is doing you don't run into such stack overflows because whatever code you're looking at is probably only calling functions you already understand. But for people who are new to the code -- or people who haven't looked at it for a couple years and have forgotten most of the details -- I think asking people to read the code to figure out what a function does is too much of a bar to understanding.

In my opinion, if you can't figure out what a function does by its name and its parameter names, it is a poorly named and thus poorly documented function.

I think function and variable naming is the one of the most important aspects of programming. Without good naming, you can easy double the amount of time it takes to edit and extend functionality.

There's a tradeoff in the naming of functions. If you have lots of functions named functionWhoseNameMakesItVeryClearExactlyWhatItDoesToItsParameters(), your code will end up being hard to read simply by virtue of the volume of text people need to read through. I prefer to have gist_of_functionality() with more details easily accessible in an obviously-placed comment.
A function may have the perfect name at the time you wrote it. It may make perfect sense within the context that you initially conceived of it. However, after some time away from it, when you're trying to mentally rebuild that context, it may make as much sense as def foo().

Good naming is important, but you also have to know the context, which is more difficult to remember.

I don't understand how you can really have a function name that properly says what it does without it being overly long.

In particular, function names seem to focus more on expressing the postcondition but make no mention of preconditions, exceptions, error conditions - all which also need to be known when figuring out what a function does.

In other words, strlen becomes strlen_segfaultifstrnull. You could theoretically encode that in the parameter name, but it becomes even messier when you have functions that raise exceptions. And now instead of long function names, all your functions are using long variable names that serve no purpose other than documenting said error conditions.

Honestly, it was mostly the concision that struck me at the time. Looking at the zipmap source now, I think it's more or less where my code tends to end up in terms of comments--some motivation / higher-level comments at the top, and most functions have a comment before, hopefully with some explanation of any edge/NULL cases as well. Any open-source projects you'd point to as good examples of what looks readable to you? Always trying to improve my own coding habits as well.
I like to think that most of my recent code is pretty readable. The largest chunk of open source is my kivaloo data store (http://www.tarsnap.com/kivaloo.html, browsable svn repository at http://code.google.com/p/kivaloo/source/browse/).
Very readable for something so dense. Nice work!

One suggestion from the peanut gallery:

In http://code.google.com/p/kivaloo/source/browse/trunk/lib/dat..., at line 178 you have:

    if (rehash(H))
        goto err0;
That's because rehash returns 0 upon success, and -1 on failure.

To me, that's very confusing, because it reads to be erroring upon success. When I read rehash's body, I learn that 0 means success and -1 means failure.

I'd prefer one of two methods. First, you could return 1 on success, and keep returning something falsy on failure. Then the code would read:

    if (!rehash(H))
        goto err0;
Which I think is more readable. Perhaps even better would be to use a constant, i.e.,

    if (rehash(H) != REHASH_SUCCESS)
        goto err0;
although that can easily lead to a mess of redundant constants, or a headache managing them.

Just my personal preference. Really nice code, thank you for sharing it.

Very readable for something so dense. Nice work!

Thanks!

you could return 1 on success, and keep returning something falsy on failure

I come from an OS background, so to me 0 is success and non-zero is failure. It doesn't really matter which convention a project uses as long as it's consistent, so I documented this in my /STYLE file: "In general, functions should return (int)(-1) or NULL to indicate error."

I wonder how individualistic that really is.

I like well documented code, I don't trust that documentation to be accurate so I don't include it in my readability score. I like to have around 10 to 40 lines per function, reasonable levels of reuses, a reasonable overall structure and function depth to stay reasonable aka f1 call f2... calls f10 is fine, f1 calls f2... calls f40 smells bad. f1 calls f2 calls f1 is fine though. And stay away from the more colorful parts of the language.

How about you?

I generally don't trust external documentation, but I do trust in-line comments.
You're probably wrong. Look at the comment above "stretches" here: https://github.com/binarylogic/authlogic/blob/master/lib/aut...

This is crypto-related code in of the most popular Rails authentication gems. And this is one among many examples. I'm sure you've already come across something like that more than once.

I trust code rather than comments. At least it doesn't lie.