Hacker News
by rohern 4834 days ago
There is a lot of weak-ass criticism going on in this thread when the data -- whatever about its methodology is troubling -- seems to almost perfectly back up what is the common experience among programmers. Yes, copy-and-paste doubtlessly affected the numbers for JavaScript, but I am not at all surprised to see JavaScript where it is.

Does anyone here really doubt that you can get more done with a single line of Python than a line of C/Java/C++? Same for Clojure/Common Lisp/Racket versus Python.

We might not take individual rankings too seriously, and none of this affects language choice when performance is a critical concern (though the spacing between Scala, OCaml, and Go is interesting and relevant to this), but do you guys honestly doubt the trend here? Does anyone have a strong counter-example? It seems like the authors may have had a decent notion in using LOC as a measure. There is no proof of this here, but I am intrigued by it.

The final conclusions in favor of CoffeeScript, Clojure, and Python are again pretty obvious. Is anyone going to suggest JavaScript or C++ is more expressive than any of these?


> There is a lot of weak-ass criticism going on in this thread when the data -- whatever about its methodology is troubling -- seems to almost perfectly back up what is the common experience among programmers.

So?

I mean, really, I can come up with completely bogus metrics all day, and whenever one produces results in a domain that happen to align with CW in that domain, post an infographic using it, but that doesn't make that metric meaningful.

> The final conclusions in favor of CoffeeScript, Clojure, and Python are pretty obvious, I would think.

So? A metric that has no intrinsic validity doesn't become valuable just because it produces conclusions which match what you would have assumed to be true (whether based on valid logic or not) before encountering the metric.

Yes, thank you for that 9th-grade science lesson.

The commenters in this thread are writing off the data because...? They decided the measure is bad? When the measure conforms to experience, it's probably worthwhile to look into it. This doesn't mean that correlation implies causation and yada yada 9th-grade science lesson.

> The commenters in this thread are writing off the data because...? They decided the measure is bad?

Yes, because what the measure actually measures isn't a valid proxy for what it purports to measure.

> When the measure conforms to experience, it's probably worthwhile to look into it.

No, if the adopted proxy (here, "LOC per commit") has some sound rationale for being used as a proxy for the actual quality of interest (here "expressiveness"), then it is worth actually getting some results with it for which you have a firm expectation of what those results would look like if you were able to directly measure the quantity (in this case "expressiveness") for which you are using the proxy (in this case "LOC per commit").

If after such testing the proxy -- which you first looked to for reasonableness, and then tested on the "simple" data for which you had a firm expectation of what the results would be for the quality of interest -- seems workable, it's worth investigating what kinds of results it returns for things where you don't have a firm idea of where they would fall. (Which is the only reason you use a proxy measure in the first place.)

In this case, the proxy fails at the first test (sound rationale for using it as a proxy for expressiveness), which makes the second test (do the results line up with what you'd expect on a known sample set) meaningless.

Obviously I and the writer of the article disagree with you that it fails the first case.

> Obviously I and the writer of the article disagree with you that it fails the first case.

That's hard to tell in your case, since most of your commentary has explicitly skipped past the criticism that the proxy has no clear link to the thing it stands in for, saying instead that this doesn't matter since the results were about what you would expect, rather than actually addressing the criticism.

So it sounds like you were failing to understand the first test more than you were disagreeing with the criticism based on it. And, as yet, you haven't stated any reason for disagreeing, just continued to skip to the second test.

The supposition is that a more expressive language lets you do more with a single line of code on average than a less expressive language. The second supposition is that commits tend to be done to gather code expressing a single chunk of functionality in a program, so that on the average commits have the same utility in terms of what they contribute to the source project.

It's clear from this, I would think, why therefore length-of-commit is supposed to be a good proxy for measuring expressiveness.
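
A minimal sketch of that proxy, assuming each commit is recorded as a (language, lines changed) pair. The sample numbers and the name `median_loc_per_commit` are made up for illustration, and the article's actual pipeline may well differ:

```python
from collections import defaultdict
from statistics import median


def median_loc_per_commit(commits):
    """Group commits by language and return the median lines changed per commit.

    `commits` is an iterable of (language, lines_changed) pairs.
    """
    by_language = defaultdict(list)
    for language, lines_changed in commits:
        by_language[language].append(lines_changed)
    return {language: median(sizes) for language, sizes in by_language.items()}


# Hypothetical numbers, not taken from the article.
sample = [
    ("python", 12), ("python", 30), ("python", 18),
    ("c", 40), ("c", 95), ("c", 60),
]
print(median_loc_per_commit(sample))
```

Under the two suppositions above, a lower median would be read as higher expressiveness. Whether that reading is justified is exactly what this thread is arguing about.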

To be clear -- the reason that it is obvious that I and the author disagree with you on the first case is because your objection was a) an elementary one and a consideration important to all such investigations, therefore it would be considered by anyone doing such an investigation or analyzing one and b) we were disagreeing with you anyway.

If you only validate research via checking if it "agrees with experience", then what's the point of doing it in the first place?
That's not what we're doing here. This article is really an examination based on a set of assumptions about how we can measure expressiveness. This is a difficult thing to measure. You could (I assume) do just as well by polling thousands of programmers and asking them which languages, in their experience, are expressive. In the case of this article, the measure of expressiveness used seems to match up very well with a) common programmer experience and b) the intentions of language designers. And we're not talking about programmer experience in 2013. This split between Lisp, C, and Fortran is older than I am.

I do not see anyone offering better measures of expressiveness or suggesting counterexamples to invalidate the results. The criticism here is just "Meh, not impressed".

Vala and C# are two very similar languages that are on polar opposite sides of the chart. Why? If I can't answer that in a convincing way, my first thought is going to be "because there is another factor involved in the rankings that wasn't accounted for."
I don't think anyone is saying it is completely uncorrelated with the real thing. Sure, if you split the chart in half, the languages on the right will mostly really be less expressive, but we know this without having the chart anyway, and the more granular results don't seem trustworthy.
This is exactly the point I am making.
There's enough good data backing up that conclusion that there's no point using crappy data like in the article.

Nobody will argue C is more expressive than Python, but the data in the article doesn't support it. Just because something is true doesn't mean it's okay to support it with shoddy data.

LOC per commit isn't a proxy measurement of the expressiveness of a language. The entire premise of the article is flawed.

The data in the article based on LOC seems to match very closely conclusions based on other data. I do not know that we get to throw out this measure just 'cuz. This is proof of nothing, but no one is offering proof that we should ditch this measurement.
I'd almost bet the author started with the conclusion and went searching for more data to back it up, so it's not a surprise to me that his data backs up his conclusion.

And nobody is offering proof that this measurement is meaningful, so it should be ditched.

I started with a hypothesis that LOC/commit might be an interesting way to compare the productivity of languages, and I frankly had no clue whether it would produce anything useful or interpretable at all. When it did, I figured it was cool enough to write it up, although it's definitely fairly noisy data.
I personally think that the poor methodology of this post would never have survived to see the light of day if the conclusions did not match what programmers expect. Conversely the methodological flaws mean that we should be very careful about accepting the data for any conclusion beyond, "Well, it looks like what I expect."
Fair enough. That is an entirely valid critique. However, I do not think this makes the article worthless. Given the apparent correlation between known expressiveness and the data derived here, it may be that they had a good notion using length-of-commit as a measure for expressiveness.
> it may be that they had a good notion using length-of-commit as a measure for expressiveness.

Replace "expressiveness" with "author's anticipation of reviewer difficulty based on prevailing cultural biases" and you have a different conclusion for how authors size groups of changes that also matches the order graphed.

If you want literal expression-per-line why not just look at compressed_size/line_count for the available body of work in each language?
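
A rough sketch of that compressed_size/line_count idea, assuming zlib as the compressor and counting only non-blank lines. Both choices, and the function name `compressed_bytes_per_line`, are assumptions rather than anything specified above:

```python
import zlib


def compressed_bytes_per_line(source: str) -> float:
    """Return zlib-compressed size in bytes divided by the number of non-blank lines."""
    lines = [line for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    compressed = zlib.compress(source.encode("utf-8"), 9)
    return len(compressed) / len(lines)


# Two synthetic inputs: one highly repetitive, one where every line differs.
repetitive = "x = x + 1\n" * 200
varied = "\n".join(f"value_{i} = compute_{i}(arg_{i * 7})" for i in range(200))

print(compressed_bytes_per_line(repetitive) < compressed_bytes_per_line(varied))
```

Repetitive text compresses to far fewer bytes per line than varied text, which is the property the suggestion relies on. Running this over a real body of code per language would still need decisions about comments, generated files, and vendored code.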

I think this plot begs a different question... which languages are being abused by the development community?

JavaScript is way too expressive for its given position. I also believe Ruby is more expressive than Python, and yet the plot shows the opposite there as well.

This plot could have some interesting data, but there's far too much noise to really learn much from it.

I agree. I was also surprised by Ruby's position. I suspect the problem is related to that with JavaScript. That is to say, people including code in their commits that they didn't write.
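
To see how commits full of code the committer didn't write could distort a per-language figure, here is a small sketch with made-up commit sizes. The numbers are illustrative only, `summarize` is a hypothetical helper, and the article's own aggregation may behave differently:

```python
from statistics import mean, median


def summarize(commit_sizes):
    """Return (mean, median) of lines changed per commit."""
    return mean(commit_sizes), median(commit_sizes)


# Hypothetical commit sizes (lines changed).
hand_written = [8, 12, 15, 20, 25]
# The same history plus one commit that checks in a third-party library wholesale.
with_vendored = hand_written + [9000]

print(summarize(hand_written))
print(summarize(with_vendored))
```

A single pasted-in commit drags the mean up by orders of magnitude while the median barely moves. If such commits are common in a language's ecosystem, though, even the median will drift.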
Actually, I think that's exactly the problem. There is a general perception based on anecdotal experience, followed by a non-rigorous 'scientific' data 'experiment', followed by analysis of results which throws out all the data which disagrees with the original perception. Look at the actual results: they don't really show a strong correlation with 'common experience among developers'.
>>you can get more done with a single line<<

Maybe not if you follow PEP 8 -- maybe so if you write really really long lines ;-)

:D
>Does anyone here really doubt that you can get more done with a single line of Python than a line of C/Java/C++?

I've never understood this criticism. Compare this line of Python:

x = 3

to this line in C:

DoAllTheThings();

A single line of code is a bad comparison because it doesn't say anything about the underlying language or platform.

These are valid lines of code (plus or minus a ;) in both languages.
That is exactly my point.
Well, then your point is to say that comparing two animals on the strength of their legs is ridiculous because both animals have legs. This is just a non sequitur. No one was confused about the fact that you can make a function call in C that does more than a variable assignment in Python.

We know a single line of code can do a lot of things in both languages. The question is what the average line does. And this is important, because the occurrence of bugs is directly correlated with lines written rather than the complexity of those lines, and the same seems to be true for programmer productivity. So if 100 lines of Python does more than 100 lines of C, this is an important fact, as in the first case more will have been accomplished for the same amount of work and the same debugging effort.

No, my point is that comparing a single line of code is ridiculous. You can only get an understanding of the efficiency of a language when you look at many lines of code. I would think you would need a few thousand lines of code that did something non-trivial before the true picture emerges.

Not even when you come up with an answer will you be able to say something about a single line of code. Meaningful statements can only be made about lots of lines of code.

Except perhaps about VB. Screw that language.

Yeah, sure. That's what an average is. It is taking the performance of a large case and reducing it to a single unit in order to make comparisons.