Hacker News
by uberalex 4256 days ago
I think that the issue is that just measuring simplicity in terms of number of lines is a bad metric. You can have extremely complex expressions in a single statement that are at least as hard to read and debug as an equivalent, much longer piece of code that employs temporary variables and single-purpose statements.

I disagree fundamentally.

I have noticed that every page I scroll causes a comprehension loss of around 90%, so in reading something that is 10 pagefuls long, I might only be able to reproduce a tiny part of the program.

Your mileage may vary.

I find that when I'm not scrolling, and just moving my eyes, I rapidly absorb the program, and I find most bugs just by reading the code. This practice is absolutely impossible for me if I have to scroll very far, and is made difficult by scrolling at all.

It is for this reason that I find simply counting the actual words to be an excellent estimate of complexity.

By the way: There are several temporary variables in that code; c:: creates a view called "c" which automatically updates whenever the dependent variables on the right side change.

Yes, the research literature on software development has consistently found that code size is the best measurement of complexity and predictor of error rates. (Sorry I don't have citations handy but we've discussed this many times on HN, and there's a recent study in the book "Making Software" that adds to it.) What's interesting is how strongly this goes against what most people think they know about good programming and clear code.
I just had to troubleshoot a small helper app that took some HTTP input and wrote to a DB. The code was in C# and had about 10 files spread over 3 namespaces, plus a separate test infrastructure project. All sorts of factory models were used to set up an "HTTP pipeline" and authentication modules. The problem I had to fix: after a server upgrade, authentication was broken.

After digging around for a while, I discovered there was no bug. The partner's client code had the auth disabled, and the previous server was misconfigured to not require auth. All of which would not have been a problem if the system just did an 'if headers.auth != "Basic ..."' check - but buried in this forest of stuff, it was overlooked.
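For illustration only, here is roughly what that kind of direct check could look like. This is a minimal sketch in Python rather than the C# of the app described above, and every name in it (expected_basic_header, is_authorized, the "Authorization" header key, the sample credentials) is a hypothetical stand-in, not anything taken from the real system.

```python
import base64
import hmac


def expected_basic_header(user: str, password: str) -> str:
    """Build the 'Basic <base64(user:password)>' value the server expects."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


def is_authorized(headers: dict, expected: str) -> bool:
    """Reject the request unless the Authorization header matches exactly."""
    # A missing header becomes "", which can never equal a real credential.
    supplied = headers.get("Authorization", "")
    # compare_digest avoids leaking how much of the value matched via timing.
    return hmac.compare_digest(supplied.encode(), expected.encode())


# Hypothetical credentials, purely for the example.
expected = expected_basic_header("partner", "s3cret")
print(is_authorized({"Authorization": expected}, expected))  # header present
print(is_authorized({}, expected))                           # auth disabled on the client
```

The point is only that with the check in one visible place, a client sending no credentials fails immediately instead of slipping through a misconfigured pipeline.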

It seems that some developers just love their edifices. They build all this "infrastructure", expanding code by an order of magnitude or more. It's considered good and robust and so, so much writing online is dedicated to this pursuit. I think it gives those programmers a feeling of import, as if they're really architecting something, not just pushing a few form fields around.

Even on a line-by-line basis, it's shocking how they love verbosity. Type inference? Nope, that makes things too compact and hard to read. Higher-order functions to wrap up common patterns? Too difficult to understand. I'm not sure if developers simply lack the tiny bit of extra intelligence, or if they've tried it and honestly concluded that overflowing verbosity is the key to readability. Either way, it's sad, and holding back progress slightly.

Right, there seems to be one group that thinks like this and another that pushes back against it. I recently had a discussion about it, and the 'architecting' bunch (we need 20-layer-deep directories with 1000s of files with < 10 lines / file) keep shouting about maintainability. The problem is that, after 25 years of professional coding in many different circumstances, I see that most good programmers are much quicker to understand the 'non architected' code (in quotes because good code is not gibberish; it is architected, just not by randomly generating design patterns and applying them). The not-so-good programmers say that the 'architected' code is much more maintainable, but take weeks or months longer to do anything worthwhile as they are 'grokking the architectural choices'.
If someone produces smaller and faster code than me, then I should want to learn from it. I wonder why other people have the exact opposite reaction.

Why do you think that is?

I think that "It's what I'm used to." is the main reason - intellectual comfort zone.

Having learned BASIC, FORTRAN and Pascal, C seemed like line noise - at first. As did PERL. And then k.

Btw, COBOL seemed "too verbose".

Once I actually started writing many k programs and then reading even more of them, I was able to recalibrate for the abstraction/density. I moved my intellectual comfort zone. Ironically, I was already there with mathematics. However, programming languages were different :).

Now, as a result, every time I have to read Java, I suffer from a kind of fatigue - having to read way too much code to glean the writer's intent. I just want them to get to the F'ing point.

N.B. - Mathematical literature/writing went through this same transition during the Renaissance. Equations were described in natural language (not unlike COBOL). A simple polynomial could require a paragraph of text to describe.

I'm not sure -- I know that after the fourth or fifth time solving a problem on projecteuler.net in 20 lines of code and seeing someone post a 1-line J/K solution, I went and downloaded J. I even managed to solve a few Euler problems with it, which I regard as a large accomplishment for a novice. I like to tell people I've written a whole twenty or so lines of code in J!
I do enjoy learning about such things, but, for most of the work I do, performance is nowhere near at the top of the list of things I care about. Also in the past I've been burned by code that's small/fast but is otherwise utterly unmaintainable. I'm not saying that's the case here, but... past experience, and all that tends to color perceptions.

I think with a language like k or q, which appears to be purpose-built for certain types of problems, people look at it and get easily confused and discouraged because it's so different from all the more mainstream general-purpose programming languages they're used to. And it's a lot easier to put down something you don't understand than to admit you don't get it, or to spend lots of time learning something that may not be of much use to you. Kinda sucks, but it's often human nature.

> I think with a language like k or q, which appears to be purpose-built for certain types of problems,

The thing is, it's not purpose built, and it doesn't even appear to be if you suspend your disbelief. The only reason you'd think it is purpose built is because "well, it can't be this short if it wasn't purpose built". But if you go over the manual, and find special built operators, please tell us what they are.

e.g., to compute an average, you can use the function avg:{(+/x)%#x} - with the exception of parentheses, every character has an orthogonal function. Similarly, the maximum subarray sum solution mss:|/0(0|+)\ ; and there are many others. And it's not just math stuff - http://nsl.com has lots of other examples of many kinds -- and most importantly -- is an operating system + GUI not general enough?
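For readers who don't read k, here is one way to spell out what those two definitions appear to do, as a rough Python sketch. It rests on my reading of the k: avg is sum divided by count, and mss is a running sum clamped at zero, scanned from a seed of 0, with the maximum taken over the scan. Under that reading the empty subarray counts, so an all-negative input gives 0. The Python names simply mirror the k ones.

```python
from itertools import accumulate


def avg(xs):
    """Sum divided by count, mirroring (+/x)%#x."""
    return sum(xs) / len(xs)


def mss(xs):
    """Maximum subarray sum, mirroring |/0(0|+)\\ as read above.

    The scan keeps a running total that is reset to 0 whenever it would
    go negative; the answer is the largest value the scan ever reaches.
    """
    running = accumulate(xs, lambda acc, x: max(0, acc + x), initial=0)
    return max(running)


print(avg([1, 2, 3, 4]))
print(mss([-2, 1, -3, 4, -1, 2, 1, -5, 4]))
```

Even in this form the logic is two short functions; the k version just drops the names and the scaffolding.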

How would you represent a graph and implement DFS in K?
One benefit to a short program is that there's not much code to rewrite if you can't read something.

This doesn't happen very often, but I find the thought comforting.

It all depends on how you define "small and fast".

Comments obviously are not code, so it's reasonable to complain about lack of comments.

You suggested wordcount, I think wordcount is good, so it's reasonable to complain about single letter words rather than descriptive words.

uberalex's suggestion for reformatting wouldn't change the algorithm or speed. It would simply spread operations across more lines. That also seems like a reasonable thing to ask, to me. They can learn your method either way.

Edit: I mean, I'm sure fitting more on the screen is valuable, but people already know how to fit many times as much code onto a screen. They avoid it on purpose for whatever reason.

>I mean, I'm sure fitting more on the screen is valuable, but people already know how to fit many times as much code onto a screen. They avoid it on purpose for whatever reason.

I think this reason (whatever it happens to be) is probably wrong.

I don't understand it either. But it happens everywhere, not just in code. Most people are only interested in the "truth" and "facts" as long as it fits within their existing world view.

And things like K rarely do.

Hey, just read bytecode, then. As small and fast as you can get.
But, is number of lines a particularly good size measurement?

Is there evidence one way or the other on whether it's better to measure size with, say, number of lines, number of tokens, or number of nodes in a parse tree? or something else?

My understanding of the literature is that no one has found a better way to measure program complexity than lines of code. In particular, the fancier metrics (cyclomatic complexity and so on) don't add any value over simple LoC.

We've debated the merits of counting tokens before, but I don't recall anyone mentioning a study about it. In real programs—i.e. when you're dealing with idiomatic code as opposed to something designed to game a metric—I doubt that LoC, lexical length, and number of tokens differ much.
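If anyone wants to compare the three candidate measures on their own code, a quick sketch along these lines would do it for Python source. It uses only the standard library (tokenize and ast); the function name size_metrics and the two sample snippets are made up for the example, and deciding which tokens count as "real" is a judgment call on my part, not an established definition.

```python
import ast
import io
import tokenize

# Token types that carry layout or comments rather than program content.
_LAYOUT = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
    tokenize.COMMENT, tokenize.ENDMARKER, tokenize.ENCODING,
}


def size_metrics(source: str) -> dict:
    """Return non-blank line count, token count, and AST node count."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    tokens = [
        tok for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type not in _LAYOUT
    ]
    nodes = sum(1 for _ in ast.walk(ast.parse(source)))
    return {"lines": len(lines), "tokens": len(tokens), "ast_nodes": nodes}


compact = "total = sum(x * x for x in range(10))\n"
verbose = (
    "total = 0\n"
    "for x in range(10):\n"
    "    square = x * x\n"
    "    total = total + square\n"
)

print(size_metrics(compact))
print(size_metrics(verbose))
```

Running this over a real codebase rather than two toy snippets would show whether the three numbers actually diverge in idiomatic code.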

Scrolling doesn't bother me, but unnecessary code does. So long as I can see the algorithm on the screen that's fine. Love the kOS idea, keep on it!
Actually it is a good metric, certainly to the first order. Yes, you can have a line or three that is more complex than the rest but practically it isn't going to reduce the line count that much.

And token counts don't help as code that insists that each brace must be on its own line detracts from readability. For one thing it pushes the last bit of the function off the bottom of the screen meaning you have to scroll.

A line that is overly complex will eventually get rewritten.

I say this as someone who has written large bodies of code in Sigma 5 assembly, Fortran II and IV, BLISS-36, C, C++, and Lisp. Perhaps more to the point, these days I read large bodies of code measured in millions of lines. Lines of code dictate how long it will take to understand it.

Peter Norvig, in PAIP, gives some examples of small code and how it can be exceedingly clear.