Hacker News new | ask | show | jobs
by Refefer 4514 days ago
An important realization I made a while back was that design methodologies do little to address program correctness, which is almost always the wildcard on deliverables; buggy software means missed deadlines and budget. Some, such as TDD, work to address the rapid building of tools to a particular spec, but often fail to promote static guarantees, especially in languages and environments where such provability is largely impossible. Dynamic languages penchant for monkey (guerrilla) patching further exacerbates the problem.

Solutions to this are tough. My first suggestion would be to use languages which facilitate correctness, although it's usually at the expense of developer availability: the pool of engineers with experience and know-how in true FP is orders of magnitude smaller than more pervasive languages. My second thought is to further embrace math as the building block for non-trivial applications: mathematical proofs have real, quantifiable value in correctness. I find it no surprise that the larger companies have made foundational maths, such as category theory and abstract algebra, the underlying abstraction for their general frameworks. This is even a tougher pitch than the first since most engineers don't recognize what they're doing as math at all - a big part of the problem. So many of us are doing by feel what has already been formally codified in other disciplines.

I'm aware that both require more (not necessarily formal) education than most engineers have pursued and makes it a difficult short-term pitch point for any company, but I think if we're serious about eliminating sources non-determinism from projects, it's important we address them directly.

2 comments

I think that's one aspect of the problem, but the migration from make-do to mathematically rigorous code can be equally fraught with peril. While the code itself can be made predictable, and easy to reason around, the time estimates and project planning often cannot. There is never enough time to factor out all of the commonality, remove all of the unnecessary use of state, codify all the assumptions into data types, etc., so you have to pick your battles.

A programmer needs intuition of what the biggest, most effective improvements are on the code base, which allow them to get the most work done. They also need some ability to guess how much time it will take, so that they don't miss deadlines. No amount of static type analysis will fix that.

Ah, but that's a matter of design - except now we have strong constructs from which to consider our problem. We will never get away from developing the architecture of our system, which is cost dependent on how well understood the domain is. Ideally, that's where we should aim to move: problem specifications that render implementations rote. A lofty goal, I know, but within closer reach every day and possible in many environments already.

Part of the beauty of proof is that so long as it is correct, the individual lemmas are largely irrelevant: we don't necessarily need to remove commonality or statefulness. Of course, I'm purposefully glossing over extra-functional requirements on which, outside of big-O, we don't have a firm grasp.

I'm attempting to stay away from "effective" or "most work done" since they're ill defined and highly subjective but rather focus on measurable changes. I'd argue that, amortized, the upfront costs of better understanding the problem definition results in cost savings down the line, especially as it recedes into maintenance.

The dark side proof is that most mathematical problems are incredibly contrived. Taking a business requirement and translating it into known mathematical problems is harder than it sounds. Once you can do that, you're 90% of the way there, and people who can do that reliably are regarded as geniuses.

But most of the time, development is done without full understanding of the problem space. Usually, the problem doesn't become fully understood until you've already spent a good amount of time coding up your solution. If you wait until you fully understand the problem before starting, then you'll never start, because your brain simply can't comprehend the entire scope.

So instead, you get code bases full of sort-of well-factored code, but with lots of unwittingly reinvented wheels. This status is occasionally improved when somebody really smart happens to notice the commonality, and remembers a classic problem that it resembles, and manages to refactor the entire thing using the general solution. However, this almost never becomes apparent during the first revision.

"I find it no surprise that the larger companies have made foundational maths, such as category theory and abstract algebra, the underlying abstraction for their general frameworks."

I would like to learn more; do you have a specific example?

Having worked with real "engineer" engineers, I've found that they have, and value, a considerable amount of mathematical education, but that education is all in continuous mathematics; abstract algebra and formal logic have about the same amount of respect as basket weaving. Unfortunately, continuous math isn't particularly useful for software.

As an electrical engineer who briefly flirted with computer engineering, I had 4 semesters of continuous math and 1 semester of discrete math[1]. I also had classes like linear systems and electromagnetics where I had to actually use continuous math heavily.

While I do not think continuous math is particularly useful for general software development, I think it is very valuable for specific problem domains. I have found my Calculus/DiffEq foundation to be very valuable for my work with radar signal processing and NLP code, more so in the former because it was basically translating electrical engineering algorithms, formulas, and concepts into C. It is also important for any type of development that makes heavy use of geometry.

As a side note, I saw some of the bias you describe out of the more "pure" EEs I worked with when I first started. There was a strong bias against software engineers, particularly those who went the CS route, because they didn't understand the math and physics behind the hardware. Admittedly, some were clueless and probably should not have been writing software for multi-million dollar hardware[2]. Most were competent, though, and able to pick up the basics they needed to know when tutored for a bit.

[1] Which was actually titled "Discrete Mathematics", and was just covered basic set theory, combinatorics, and linear algebra. [2] Like the one who added a reset routine that blindly opened power contacts on the UUT without verifying that the power supplies were off first. Fortunately, that was caught before they actually opened with hundreds of amps going through the bus.

As you say, continuous math (to my mind, calculus, diffeq, and linear algebra; anything involving reals) is necessary for some problem domains. But accounting is necessary for some problem domains as well. And molecular biology.[1]

But if I get worked up into a good froth, I can make a case that software development is applied formal logic or applied abstract algebra (or both). I don't believe you can do professional software development (in Weinberg's sense) without some serious discrete math, in the same way you can't do signal processing without calculus.

[1] If you've got something that mixes the three, let me know. It's probably something I should stay away from.

I must admit to ignorance of Weinberg's books and other writings. My interest is piqued now, though.

That said, based on your second statement nearly all scientific and engineering programming would not qualify as "professional software development". The code I worked on had little to no discrete math or formal logic in it. There was not an integer to be found save loop counters and array indices Do you not consider an (electrical engineer | mechanical engineer | physicist | molecular biologist) who can code and spends the vast majority of their time writing production code like this a professional software developer?

Two good examples from the Scala community would be Algebird[1] from Twitter which uses Monoids as a main abstraction and Spire[2], which might be the best numerical library out there and heavily rooted in Abstract Algebra.

1. https://github.com/twitter/algebird 2. https://github.com/non/spire

I considered editing my other comment but decided instead to break it out.

There are a couple of complexities that your comment illustrates well:

First, continuous math _is_ available and immediately applicable today. The problem is that we often reason in and program to the implementation, not the abstraction - a subtle difference, but an important one. Not only that, but by reasoning in a flawed representation, we often miss important derivations that result in dramatic simplifications and reductions in the problem domain. I would also argue that we already do use continuous math regularly - for example, linear algebra, combinatorics, and set theor: most of us only know them as arrays, random, and SQL.

Secondly, not enough effort is made in formal education for applying 'pure' math to computer science. Some branches, such as linear algebra, have obvious implementations and analogies already available but others are quite a bit less clear - I fault this more on curriculum silos than an engineer's innate abilities. It's a learned skill that just isn't often taught.

This might be beside the point, but combinatorics, set theory and algebra are prime examples of discrete mathematics, not continuous.