Hacker News
by rixed 3189 days ago
Figure 2 suggests another possible bias favoring functional, managed languages: a lot of errors for C/C++ are related to concurrency and performance. But those are mostly non-bugs for other languages, because when concurrency or performance is a requirement, most of the other studied languages would not be considered anyway.

It seems similar to the paradox that makes the best medicine appear to have a lower survival rate just because it's given to the most serious patients.
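
To put numbers on that paradox, here is a minimal sketch. Every count is invented for illustration and nothing here comes from the paper; the names `best_medicine` and `older_medicine` are hypothetical too. The better drug wins within each severity group yet looks worse overall, simply because it is handed mostly to severe cases.

```python
# Hypothetical counts, invented purely for illustration: (survivors, patients)
outcomes = {
    "best_medicine": {"severe": (60, 100), "mild": (10, 10)},
    "older_medicine": {"severe": (4, 10), "mild": (90, 100)},
}


def survival_rate(groups):
    """Pooled survival rate across all severity groups."""
    survivors = sum(s for s, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return survivors / patients


# Pooled view: ignores who was given which drug.
overall = {name: survival_rate(groups) for name, groups in outcomes.items()}

# Stratified view: compares like with like.
by_severity = {
    name: {severity: s / n for severity, (s, n) in groups.items()}
    for name, groups in outcomes.items()
}

print(overall)
print(by_severity)
```

The analogy to the study would be that "severity" plays the role of how demanding the project's performance and concurrency requirements are.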

4 comments

Rephrasing to make my point clearer: you open a performance bug against, say, a C program more often than against, say, a Python program, because performance is more likely to be a requirement of a C program than of a Python program.

Similarly, again for performance reasons, your average C program will have more concurrency than your average Python program, and therefore also more concurrency bugs.

Another way to put it: you use C when you have a complex problem to solve and Python when you have a simple problem to solve (unless you are a masochist or a purist, I suppose). So one reason those languages have more bugs may just be that the programs themselves are more prone to errors (which might be only slightly related to size, if at all).

>when concurrency or performance are a requirement

I see what you're getting at, but this is an irksome way to put it. We're clearly unwilling to wait for the heat death of the universe for our programs to terminate. Performance is always a requirement.

"When the performance requirements can only be met by C/C++" might be a more accurate formulation, but then it's just tautological.

Java, Go, Obj-C, Erlang, and Scala are all certainly in the running when concurrency is required, and fit within many latency budgets just fine. The managed and dynamic languages on the list are typically used in contexts where latency is dominated by network and disk I/O, so marginal CPU efficiency isn't worth much. That doesn't mean performance isn't a requirement, it means the most effective ways to increase performance are different. Adding indexes, optimizing queries, caching, etc.
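
As a rough sketch of that last point, assuming a lookup whose cost is dominated by I/O: `fetch_profile` is a hypothetical function, and `time.sleep` stands in for a network or disk round trip. Caching the result removes the wait entirely on repeat calls, which no amount of faster CPU code would do.

```python
import time
from functools import lru_cache

calls = 0  # counts how often the simulated slow lookup actually runs


@lru_cache(maxsize=1024)
def fetch_profile(user_id: int) -> dict:
    """Hypothetical lookup; time.sleep stands in for a network or disk round trip."""
    global calls
    calls += 1
    time.sleep(0.05)
    return {"id": user_id, "name": f"user{user_id}"}


def timed(user_id: int) -> float:
    """Return the wall-clock seconds one lookup takes."""
    start = time.perf_counter()
    fetch_profile(user_id)
    return time.perf_counter() - start


cold = timed(42)  # pays the simulated I/O cost
warm = timed(42)  # served from the in-process cache

print(f"cold: {cold:.4f}s, warm: {warm:.6f}s, slow lookups: {calls}")
```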

I think it's more that a certain level of performance is a requirement. Once you need a higher level of performance, C/C++ becomes one of only a few tools you can use. If you need higher still, then you go to either Fortran or ASM.

Where do you see this in the paper? They say concurrency errors are mostly the usual things like deadlocks and race conditions, but those absolutely do exist in every language.

Also, what do you mean most of these languages wouldn't be considered when concurrency is required? Concurrency is bog standard everywhere.

It seems like, the way they define a bug, a performance bug would be a bug relative to expectations, per project, so you can definitely have a performance bug in Go or Haskell, for example, if something works slower than developers think it should (as opposed to being slower than some external reference code or something). So maybe it's closer to something like "developer control over unexpected underperformance"?

Not even every language in that study supports concurrency, as the study itself points out. I hear a lot of praise for Go because of how much people like doing concurrency with it. The fact that they observed a higher rate of concurrency bugs in Go could just as easily support the interpretation that Go is good for concurrency as the interpretation that it is bad for concurrency.

Since Go makes concurrency easier and encourages its use, there are going to be a lot more concurrency bugs. By contrast, languages like Python don’t even have proper parallel threads, so fewer people will write concurrent Python programs and fewer bugs will arise. This is a confounding factor found in one sort or another throughout the survey.

It’s good that they did this research but unfortunately they couldn’t account for everything.
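
A minimal sketch of how that confounder can flip the ranking. The corpus below is entirely made up and says nothing about the real Go or Python numbers; the point is only that the denominator matters.

```python
# Hypothetical corpus; every number is invented for illustration only.
corpus = {
    "Go": {"projects": 100, "concurrent_projects": 80, "concurrency_bugs": 160},
    "Python": {"projects": 100, "concurrent_projects": 10, "concurrency_bugs": 40},
}


def bugs_per_project(stats):
    """Naive rate: concurrency bugs divided by all projects in the language."""
    return stats["concurrency_bugs"] / stats["projects"]


def bugs_per_concurrent_project(stats):
    """Exposure-adjusted rate: only projects that actually use concurrency."""
    return stats["concurrency_bugs"] / stats["concurrent_projects"]


naive = {lang: bugs_per_project(s) for lang, s in corpus.items()}
adjusted = {lang: bugs_per_concurrent_project(s) for lang, s in corpus.items()}

print("per project:           ", naive)
print("per concurrent project:", adjusted)
```

With these invented counts, the language that looks worse per project looks better once you only count projects that attempt concurrency at all.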

They talk about this, and did take some steps to account for it. That's why they conclude that languages are correlated more with categories of defects than with overall defect rates.

Some languages (like Clojure) very significantly reduce the possibility of thread-related bugs. Clojure in particular was designed with multi-threading in mind, so I think it is a fair point that some languages will have more trouble with this than others.

Try multithreading in C++ vs Clojure and the difference in amount of effort is well beyond trivial.

I disagree. Functional languages are well suited for concurrency.

I think so too! But the operative question is not how well suited they are to the task. It's how often they're used for the task in the corpus. And on that point I suspect that the person you're replying to has the right of it. I suspect that reaching for concurrency is correlated with a desire for high performance, which in turn I suspect causes people also to reach for these lower level languages.