by wallace_f 3151 days ago
>Especially areas like programming languages, in their applied parts, operate more through argumentation and analogy than through proofs.

I think the virtue is not in the value of the arguments, but in whether the arguments are falsifiable.

Programming has the virtue that it has to work for it to have value. Social sciences and humanities have to convince someone that it would work.

1 comments

PLs arguments are very rarely falsifiable, if that's your criterion. I mean, yes, a programming language has to work in some sense, but that's a pretty low bar; C and Lisp work and have for decades, so we could just say we're done and disband the research field. The rest of PL research exists because people claim they're building something that is in some way "better" than C or Lisp. But that's a pretty fuzzy argument, difficult to falsify. There is a small area of empirical software engineering that does try to measure things like whether certain constructs reduce bugs in real-world usage, which would be a testable hypothesis. But it has been able to establish very, very few solid things about PLs, and the vast majority of PL research doesn't look at all like that. It's more design argumentation.

To pick a concrete example: Rich Hickey introduced transducers into Clojure a while ago, using an argument, illustrated by a number of examples, for why they're useful. Is this argument falsifiable? In principle some version of it might be, if you made "useful" more precise (useful to whom? in which contexts? how would you know?). But the kind of empirical work it would take to measure it in a non-toy setting is quite difficult, so afaik nobody's tested it, or even really formulated the question precisely enough to test it. In practice, you accept or reject the construct based on what you think of the argument, or you try to find a counterargument that makes them look like an inelegant/awkward construct, but in either case you probably aren't attempting to rigorously validate or falsify a scientific hypothesis relating to them. Basically all of PL design and evolution looks more like that than like Popperian science...
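For readers who haven't met the construct being argued about, here is a rough sketch of the general idea in Python rather than Clojure. It is not Hickey's implementation or the Clojure API; the names `mapping`, `filtering`, `compose`, and `transduce` are made up for illustration. The point is only that a transducer takes a reducing step and returns a new one, so the same pipeline can feed different kinds of result.

```python
from functools import reduce

def mapping(f):
    """Transducer (hypothetical name): apply f to each item before the next step."""
    def transducer(step):
        def new_step(acc, item):
            return step(acc, f(item))
        return new_step
    return transducer

def filtering(pred):
    """Transducer (hypothetical name): only pass on items that satisfy pred."""
    def transducer(step):
        def new_step(acc, item):
            return step(acc, item) if pred(item) else acc
        return new_step
    return transducer

def compose(*xforms):
    """Combine transducers so they act on the input in the order given."""
    def composed(step):
        for xf in reversed(xforms):
            step = xf(step)
        return step
    return composed

def transduce(xform, step, init, items):
    """Reduce items with the step function wrapped by the transducer."""
    return reduce(xform(step), items, init)

def append(acc, item):
    acc.append(item)
    return acc

def add(acc, item):
    return acc + item

# One pipeline description, reused with two different reducing steps.
xform = compose(filtering(lambda n: n % 2 == 0), mapping(lambda n: n * n))
squares = transduce(xform, append, [], range(10))
total = transduce(xform, add, 0, range(10))
```

Whether that reuse counts as "useful" is exactly the part that stays a design argument rather than a measured result.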

I basically agree with you, but we're arguing to different ends.

What I am saying is that there is nothing wrong with these arguments or with using them usefully. What is wrong is when nothing is falsifiable or able to be proven, either. So in other words, arguments are useful, but it does matter whether they ultimately have to agree with scientific experiment.

Programming does, at the end of the day, have to agree with fundamental truths for it to work. Its foundations are on the metal, and everything is reducible to something you can test experimentally.

Perhaps everything except the human element: most aspects of modern product development involve programming language improvements that have to do with improving human interaction with a computer. But even here, we have a sort of market for ideas in that developers who adopt better ideas will be more successful.

It seems fundamental that you're building artifacts for users. In user interface design there's no hard science, no falsification. An application that wins now might lose later when things have changed, and this can be based on fashion.

Of course, it has to work. But a sculpture has to "work" too in the sense that it shouldn't collapse. Structural integrity is only part of the goal. Similarly for programs.

> PLs arguments are very rarely falsifiable

You mean PL design, don't you? Performance work is very empirical. If my new type of inline cache is better then I need to prove that and it's falsifiable (using benchmarks, which I admit aren't ideal).
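A minimal sketch of what "falsifiable using benchmarks" could look like, assuming a toy workload. The `baseline` and `candidate` functions are hypothetical stand-ins (a plain table lookup versus a one-entry cache of the last key), not a real inline cache, and the measured ratio will vary by machine and run.

```python
import timeit

def best_time(fn, repeat=5, number=200):
    """Smallest batch time over several repeats, to damp scheduling noise."""
    return min(timeit.repeat(fn, repeat=repeat, number=number))

def speedup(baseline_fn, candidate_fn, repeat=5, number=200):
    """baseline time / candidate time; values above 1 favour the candidate."""
    return best_time(baseline_fn, repeat, number) / best_time(candidate_fn, repeat, number)

# Toy workload (assumption): repeated lookups over a small set of keys.
TABLE = {i: i * 2 for i in range(1000)}
KEYS = [i % 7 for i in range(1000)]

def baseline():
    # Look every key up in the table.
    return [TABLE[k] for k in KEYS]

def candidate():
    # Remember the last key and value, and skip the lookup on a repeat.
    last_key, last_val = None, None
    out = []
    for k in KEYS:
        if k != last_key:
            last_key, last_val = k, TABLE[k]
        out.append(last_val)
    return out

# Hypothesis: "the candidate is faster on this workload".
ratio = speedup(baseline, candidate)
claim_survives = ratio > 1.0
```

The claim is refuted whenever `claim_survives` comes out `False`, which is what makes it testable. It is still only a statement about this toy workload.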

In fact! I can falsify your claim that PL research isn't falsifiable by using some PL research!

http://cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf

> Of the 133 papers published in the surveyed conference proceedings, 88 had at least one section dedicated to experimental methodology and evaluation

That puts it on par with psychology or sociology, which have a poor reputation in the falsifiability department.
I think most of the other papers have non-experimental proofs, so it's not that they do nothing rigorous.

And anyway the claim was 'very rarely', when it's actually a majority of cases.

Still, the benchmarks are usually toys and it's pretty easy to find contradictory benchmarking reports. Very few projects do online benchmarking comparisons of different design choices on real workloads.