Hacker News
by geocar 4256 days ago
I disagree fundamentally.

I have noticed that every page I scroll causes a comprehension loss of around 90%, so in reading something that is 10 pagefuls long, I might only be able to reproduce a tiny part of the program.

Your mileage may vary.

I find that by not scrolling, and just moving my eyes, I rapidly absorb the program, and I find most bugs just by reading the code. This practice is absolutely impossible for me if I have to scroll very far, and is made difficult by scrolling at all.

It is for this reason that I find simply counting the actual words to be an excellent estimate of complexity.

By the way: There are several temporary variables in that code; c:: creates a view called "c" which automatically updates whenever the dependent variables on the right side change.

Yes, the research literature on software development has consistently found that code size is the best measurement of complexity and predictor of error rates. (Sorry I don't have citations handy but we've discussed this many times on HN, and there's a recent study in the book "Making Software" that adds to it.) What's interesting is how strongly this goes against what most people think they know about good programming and clear code.

I just had to troubleshoot a small helper app that took some HTTP input and wrote to a DB. The code was in C# and had about 10 files spread over 3 namespaces, plus a separate test infrastructure project. All sorts of factory models were used to set up an "HTTP pipeline" and authentication modules. The problem I had to fix: after a server upgrade, authentication was broken.

After digging around for a while, I discovered there was no bug. The partner's client code had the auth disabled, and the previous server was misconfigured to not require auth. All of which would not have been a problem if the system just did an `if headers.auth != "Basic ..."` - but buried in this forest of stuff, it was overlooked.
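
To make the "just do an if" point concrete, here is roughly what I have in mind, sketched in Python rather than the original C#. Everything here is hypothetical: the function name `is_authorized`, the plain dict of headers, and passing the expected user and password in as arguments are my assumptions, not the app's actual code.

```python
import base64
import hmac


def is_authorized(headers, user, password):
    """Return True only if the Authorization header carries the expected Basic credentials."""
    # Build the header value we expect: "Basic " + base64("user:password").
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    expected = "Basic " + token
    supplied = headers.get("Authorization", "")
    # compare_digest avoids leaking how much of the value matched via timing.
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

With the check in one place like this, a client that sends no auth header fails immediately, whatever the server configuration happens to be.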

It seems that some developers just love their edifices. They build all this "infrastructure", expanding code by an order of magnitude or more. It's considered good and robust and so, so much writing online is dedicated to this pursuit. I think it gives those programmers a feeling of import, as if they're really architecting something, not just pushing a few form fields around.

Even on a line-by-line basis, it's shocking how they love verbosity. Type inference? Nope, that makes things too compact and hard to read. Higher order functions to wrap up common patterns? Too difficult to understand. I'm not sure if developers simply lack the tiny bit of extra intelligence, or if they've tried it and honestly concluded that overflowing verbosity is the key to readability. Either way, it's sad, and holding back progress slightly.

Right, there seems to be one group thinking like this and another group pushing against it, and vice versa. I recently had a discussion about it, and the 'architecting' bunch (we need 20-layer-deep directories with 1000s of files with < 10 lines / file) keeps shouting about maintainability. The problem is that, after 25 years of professional coding in many different circumstances, I see that most good programmers are much quicker to understand the 'non architected' code (in quotes because good code is not gibberish; it is architected, but not by randomly generating design patterns and applying them), while the not-so-good programmers say that the 'architected' code is much more maintainable but take weeks or months longer to do anything worthwhile as they are 'grokking the architectural choices'.

If someone produces smaller and faster code than me, then I should want to learn from it. I wonder why other people have the exact opposite reaction.

Why do you think that is?

I think that "It's what I'm used to." is the main reason - intellectual comfort zone.

Having learned BASIC, FORTRAN and Pascal, C seemed like line noise - at first. As did PERL. And then k.

Btw, COBOL seemed "too verbose".

Once I actually started writing many k programs and then reading even more of them, I was able to recalibrate for the abstraction/density. I moved my intellectual comfort zone. Ironically, I was already there with mathematics. However, programming languages were different :).

Now, as a result, every time I have to read Java, I suffer from a kind of fatigue - having to read way too much code to glean the writer's intent. I just want them to get to the F'ing point.

N.B. - Mathematical literature/writing went through this same transition during the Renaissance. Equations were described in natural language (not unlike COBOL). A simple polynomial could require a paragraph of text to describe.

I'm not sure -- I know that after the fourth or fifth time solving a problem on projecteuler.net in 20 lines of code and seeing someone post a 1-line J/K solution, I went and downloaded J. I even managed to solve a few Euler problems with it, which I regard as a large accomplishment for a novice. I like to tell people I've written a whole twenty or so lines of code in J!

I do enjoy learning about such things, but, for most of the work I do, performance is nowhere near at the top of the list of things I care about. Also in the past I've been burned by code that's small/fast but is otherwise utterly unmaintainable. I'm not saying that's the case here, but... past experience, and all that tends to color perceptions.

I think with a language like k or q, which appears to be purpose-built for certain types of problems, people look at it and get easily confused and discouraged because it's so different from all the more mainstream general-purpose programming languages they're used to. And it's a lot easier to put down something you don't understand than to admit you don't get it, or to spend lots of time learning something that may not be of much use to you. Kinda sucks, but it's often human nature.

> I think with a language like k or q, which appears to be purpose-built for certain types of problems,

The thing is, it's not purpose built, and it doesn't even appear to be if you suspend your disbelief. The only reason you'd think it is purpose built is because "well, it can't be this short if it wasn't purpose built". But if you go over the manual, and find special built operators, please tell us what they are.

e.g., to compute an average, you can use the function avg:{(+/x)%#x} - with the exception of parentheses, every character has an orthogonal function. Similarly, the maximum subarray sum solution mss:|/0(0|+)\ ; and there are many others. And it's not just math stuff - http://nsl.com has lots of other examples of many kinds -- and most importantly -- is an operating system + GUI not general enough?
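
For readers who don't know k, here is a rough Python reading of the two expressions above. This is a sketch of how I read them, not an official translation: the names `avg` and `mss` come from the k definitions, and everything else is my own choice. I read `+/x` as sum, `#x` as count, `%` as divide, `(0|+)\` with a leading `0` as a running sum clamped at zero, and `|/` as the maximum of that scan.

```python
from itertools import accumulate


def avg(x):
    # {(+/x)%#x}: sum of x divided by the count of x.
    return sum(x) / len(x)


def mss(x):
    # 0(0|+)\ : scan from 0, adding each item and clamping the running sum at 0.
    running = accumulate(x, lambda acc, v: max(0, acc + v), initial=0)
    # |/ : take the maximum over the scan.
    return max(running)
```

Because the scan starts from 0 and clamps at 0, this version treats the empty subarray as allowed, so an all-negative input gives 0. It needs Python 3.8 or later for the `initial` argument.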

How would you represent a graph and implement DFS in K?

If you are really interested, http://nsl.com/ is a treasure trove - quite a few of the examples are extremely well documented, some are not, but there's a wealth of information there.

Specifically about graphs, you can look at:

http://nsl.com/papers/order.htm - topological sorting

http://nsl.com/k/tarjan.q - strongly connected components

http://nsl.com/k/loop.q - find loops in graphs

I think in all of these the graph is represented either as a list of edges or a dictionary of node->(list of nodes that it has edges to)

Numbers in arrays can be treated as pointers, into that same array, giving a graph. It's just a question of context. If you were storing RDF triples in K you'd simply have an array for subjects, one for objects, one for predicates, and one of the URIs/text. Simply store the index of the URI/text item in each of the subject, object and predicate columns. An individual triple would be formed by the same index applied to the subject, object and predicate arrays.
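
A small Python sketch of that layout, in case the description is hard to picture. The sample data (`ex:alice` and friends) and the helper name `triple` are made up for illustration; only the shape, three index columns pointing into one shared array of terms, comes from the description above.

```python
# One shared array holding every URI/text item exactly once.
terms = ["ex:alice", "ex:knows", "ex:bob", "ex:carol"]

# Three parallel columns; each entry is an index into `terms`.
subjects = [0, 2]
predicates = [1, 1]
objects = [2, 3]


def triple(i):
    """Rebuild the i-th triple by applying the same row index to all three columns."""
    return (terms[subjects[i]], terms[predicates[i]], terms[objects[i]])
```

Row 1 reuses index 2 (`ex:bob`) as a subject after row 0 used it as an object, which is the "numbers as pointers into the same array" idea: the columns already describe a graph.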

DFS is then a variation of the more familiar functional style of tackling the problem where you have your end condition (i.e. something that matches what you're looking for) and failing that do something else (typically recursion).
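
Since I can't give it in K, here is that shape in Python as a sketch. It assumes the graph is a dictionary of node -> list of neighbours, and the names `dfs`, `is_goal`, and `seen` are mine. It is plain recursion: check the end condition, otherwise recurse into each unvisited neighbour.

```python
def dfs(graph, node, is_goal, seen=None):
    """Return a path from node to the first goal found depth-first, or None."""
    if seen is None:
        seen = set()
    # End condition: this node is what we are looking for.
    if is_goal(node):
        return [node]
    seen.add(node)
    # Failing that, recurse into each neighbour we have not visited yet.
    for nxt in graph.get(node, []):
        if nxt not in seen:
            path = dfs(graph, nxt, is_goal, seen)
            if path is not None:
                return [node] + path
    return None
```

The `seen` set is what keeps this from looping forever on cyclic graphs. Python has no tail-call optimisation, so very deep graphs would need an explicit stack instead.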

I can't recall enough of the K syntax these days to actually implement that right now though, or if K has TCO.

One benefit to a short program is that there's not much code to rewrite if you can't read something.

This doesn't happen very often, but I find the thought comforting.

It all depends on how you define "small and fast".

Comments obviously are not code, so it's reasonable to complain about lack of comments.

You suggested wordcount, I think wordcount is good, so it's reasonable to complain about single letter words rather than descriptive words.

uberalex's suggestion for reformatting wouldn't change the algorithm or speed. It would simply spread operations across more lines. That also seems like a reasonable thing to ask, to me. They can learn your method either way.

Edit: I mean, I'm sure fitting more on the screen is valuable, but people already know how to fit many times as much code onto a screen. They avoid it on purpose for whatever reason.

>I mean, I'm sure fitting more on the screen is valuable, but people already know how to fit many times as much code onto a screen. They avoid it on purpose for whatever reason.

I think this reason (whatever it happens to be) is probably wrong.

I don't understand it either. But it happens everywhere, not just in code. Most people are only interested in the "truth" and "facts" as long as it fits within their existing world view.

And things like K rarely do.

Hey, just read bytecode, then. As small and fast as you can get.

But, is number of lines a particularly good size measurement?

Is there evidence one way or the other on whether it's better to measure size with, say, number of lines, number of tokens, or number of nodes in a parse tree? or something else?

My understanding of the literature is that no one has found a better way to measure program complexity than lines of code. In particular, the fancier metrics (cyclomatic complexity and so on) don't add any value over simple LoC.

We've debated the merits of counting tokens before, but I don't recall anyone mentioning a study about it. In real programs—i.e. when you're dealing with idiomatic code as opposed to something designed to game a metric—I doubt that LoC, lexical length, and number of tokens differ much.
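
If anyone wants to check this on their own code, here is a minimal Python sketch that computes the three candidate measures for a piece of Python source using only the standard library. The function name `size_metrics` and the choice to skip blank lines, comments, and layout tokens are my assumptions; other counting rules are just as defensible.

```python
import ast
import io
import tokenize


def size_metrics(source):
    """Count non-blank lines, meaningful tokens, and parse-tree nodes in Python source."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    # Layout and comment tokens say little about program size, so skip them.
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
            tokenize.COMMENT, tokenize.ENCODING, tokenize.ENDMARKER}
    tokens = [t for t in tokenize.generate_tokens(io.StringIO(source).readline)
              if t.type not in skip]
    # ast.walk yields every node in the parse tree exactly once.
    nodes = sum(1 for _ in ast.walk(ast.parse(source)))
    return {"lines": len(lines), "tokens": len(tokens), "ast_nodes": nodes}
```

Running it over a real codebase and comparing how the three numbers rank the files would be one way to test the hunch that they don't differ much for idiomatic code. The exact node count depends on the Python version, so compare within one interpreter.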

Scrolling doesn't bother me, but unnecessary code does. So long as I can see the algorithm on the screen that's fine. Love the kOS idea, keep on it!