Hacker News new | ask | show | jobs
by deadgrey19 2530 days ago
Thinking in math first has been the catch cry of functional programmers (and their formal logic/verification friends) for decades. And there's nothing wrong with it, unless the problem you are trying to solve actually requires performance. Then, you have to think in "system" first. For example: Write a program that captures network packets and stores them to disk as fast as possible. There's no maths to think of here. The complexity is all in the "fast" part, and to solve that, a deep understanding of the architecture of the system is necessary. Fancy algorithms (maths) will not help you here. e.g. Will compression help? Depends on the properties of the system. Will batching help? Depends on the system. Will threading help? Depends on the system.
4 comments

What you're describing is a very limited view of math, resembling the general public view of math as being algebra, trigonometry, geometry, and calculus; that is, all the math people are exposed to in secondary school. Look further and you'll see disciplines such as mathematical logic, combinatorics, and graph theory, without which you wouldn't have networking or binary or computers at all, really.

I don't know what you mean by "fancy algorithms", but all algorithms that run on your computer have a basis in math.

As for making things go as fast as possible, well that's a special case of the field of optimization, another mathematical discipline.

What I'm discussing here is the general concept of modelling your program formally before writing it (as per the article). What I'm arguing is that this type of approach is only possible for a certain set of applications which take the form of y = f(x), where f(x) is some type of data transformation /computation operation (e.g. calculate the GCD of these ints, find the shortest path through a given graph, sort this set etc). There's another set of applications which I/O bound, are very important, and yet, the "computation" that they preform is limited to none. These applications are rooted in, and bounded by system parameters, like understanding how disks work, how network cards work, how CPUs work etc. This is an optimisation problem, but not one that can be modelled mathematically (in any reasonable way) because of the vast complexities of the system. Building a TLA+ "proof" of this system will reduce trivially to x = x, and yet, the system is still important, and difficult to write well.
> this type of approach is only possible for a certain set of applications which take the form of y = f(x), where f(x) is some type of data transformation /computation operation (e.g. calculate the GCD of these ints, find the shortest path through a given graph, sort this set etc)

These days I'm trying to be mostly an embedded guy, and 100% understand what you're talking about re: problems that don't lend themselves well to mathematical modelling. Figuring out that your SPI bus is going slow because you've got the wrong multipler in a clock domain isn't a math problem :)

What I'd like to add to your y = f(x) examples though is that many Business Problems can (and probably should!) be modelled as y=f(x) type problems. I've seen a ton of business logic over the years that modifies objects in a pretty ad-hoc manner and is incredibly hard to reason about, especially in the big picture. The vast majority of the time, those problems can be modelled roughly as:

  new_state = f(old_state, event)
When you start modelling the business problems more formally like that, you can start using things like TLA+ to do model checking and find gaps in your formulation of the problem. Maybe you've got an state/event pairing that you haven't thought of. Maybe there's a way to get a model into a state that it can't escape from. TLA+ is useful for a lot more than verifying "calculate the GCD of these ints, find the shortest path through a given graph, sort this set", and I want to make sure people reading this don't write it off as a mathematical curiosity.

I've done a few embedded implementations that had pretty complicated state machines under the hood (off the top of my head, a LoRaWAN implementation). I modelled the states in TLA+, and it was a wonderful platform for discovering the flaws in the model I'd put together. It took a couple iterations before the model checker was happy, and from there the implementation was mostly mechanically translating my TLA+ model into code. There was some housekeeping stuff to keep track of (the TLA+ model was an abstraction), but it pretty much worked first try.

> that's a special case of the field of optimization, another mathematical discipline.

I'm not sure how you can claim that the entire field of optimization is a "mathematical discipline". Algorithm analysis is, I suppose, but most other practical optimization work has little if anything to do with math.

When I've spent time doing optimization work, it has often involved things like:

* Discovering some API I'm using is particularly slow and finding an alternate one that's faster.

* Adding caching.

* Reorganizing code to take advantage of vector instructions. (Well, I haven't personally done this, but I know it's a think many others do.)

* Reorganizing data to improve CPU cache usage.

* Evaluating some things lazily when they aren't always needed.

* Making objects smaller to put less pressure on the GC.

* Inlining functions or switching some functions to macros to avoid call overhead.

* Tweaking code to get some values into registers.

I'm not sure why you're being downvoted, but I agree with all of your points. Those aren't things that really lend themselves well to mathematical modelling. But... there is a huge field of math that does apply to this: statistics.

The first two cases are somewhat special:

- It may be immediately obvious that an API is terrible, and that the replacement is not. If API 1 takes 1 sec to call, and API 2 takes 100ms to call, easy choice without stats.

- Caching can be dangerous. While not really a stats problem, you do need to have a really solid model of what is getting cached, and how to know when to invalidate those cache entries.

For the rest of the examples you provided, you're making changes that may make the problem better, may have no effect, or may make the problem worse. You absolutely need to use statistics to determine whether or not changes like those are actually having an effect. Performance analysis is part math and part art, and without the math background, you're likely going to be spinning your wheels a bunch. Beyond stats, fields like queuing theory are going to make a huge impact when you're doing performance optimization in distributed systems.

What you're describing is not optimization the math field, and second _some_ of the example do have basis in mathematics.

> Discovering some API I'm using is particularly slow and finding an alternate one that's faster. On it's own it has nothing to do with Math, but writing code as components/services/abstract layers with well-defined boundaries/interfaces/types mean it's easier to reason about the code and avoid bugs. Implicitly here I'm saying we should use a language that has strong type support.

> Adding caching. This is memoization and without the basis that general code functions should behave like their math counterparts this is hard to reason about.

> Reorganizing code to take advantage of vector instructions These are mapping operations that are well defined in functional languages. The vectorized interface numpy provides is an abstraction of maps.

> Making objects smaller to put less pressure on the GC This is orthogonal to the actual math basis for the code. For example using enums over strings is a localized change.

You don't need fancy math because someone else did it for you, but if the industry is to compete and advance, it's going to need all sorts of people. The fact that you don't have to juggle every mathematical complexity underlying your domain is a success of human cooperation and compartmentalization. It's certainly not free or inherent.
How so? Can you give an example?
For example, if you do end up needing compression, are you going to write you own compression library or use an existing one somebody else already wrote?
I think you're missing the point. I'm talking about the concept of mathematically modelling the program first, before writing it (as per the article). This process only applies to certain types of applications, ones that are rooted in algorithmic transformations (like compression), rather than in system and I/O operations, like copying a packet from a network RX buffer into a disk block, in the most efficient way.
I agree about thinking "in system", but your "as fast as possible" comment is interesting, because I think this actually highlights the kind of problem that mathematical thinking is good at fixing.

Why "as fast as possible"? Usually when programmers say this it's because they think "the faster, the better!". But obviously there is some speed beyond which there's no discernible improvement. At that point, pursuing the as-fast-as-possible mandate at the expense of other concerns is the wrong thing to do.

Therefore, there's an assumption built into this statement that the system will never be that fast, and therefore "as fast as possible" is a good target. Needless to say this isn't a safe assumption.

The thing is that we're built to be really bad at knowing when we're making assumptions. Thinking mathematically is to some extent a way of trying to overcome that limitation.

This article is not about formal verification.

Why would mathematical thinking involve rejecting physical realities? If there is a performance constraint you are trying to optimize, account for it.

> Write a program that captures network packets and stores them on disk as fast as possible

How are you going to do that without formulas involving:

- network bandwidth - disk bandwidth - SATA bandwidth - packet ordering - compression time and ratios - amdahl's law

This sounds ripe for mathematical reasoning! Absolutely the way you model the problem is informed your knowledge of the hardware.