Hacker News new | ask | show | jobs
by dataflow 1326 days ago
Fun fact, there are some other lessons here: it can sometimes pay off to (1) generalize your function, and (2) respect the mathematical axioms you're supposed to be following. This (obviously) isn't to say you should always generalize everything, but you should at least consider what would happen if you did so, and if the difference is small, perhaps do it. The benefit of doing so being that it can avoid problems that aren't otherwise obvious—sometimes by design, sometimes by accident.

In particular, (x + y) / 2 is the wrong implementation of midpoint in general, because it would fail to even compile on objects you can't add together. But midpoint is well-defined on anything you can subtract (i.e. anything you can define a consistent distance function for)—and it doesn't require addition to be well-defined between those objects!

One obvious (in C/C++, and not-so-obvious in Java) counterexample here is pointers/iterators. You can subtract them, but not add them. And, in fact, if you implement midpoint in a manner that generalizes to those and respects the intrinsic constraints of the problem, you end up with the same x + (y - x) / 2 implementation, which doesn't have this bug.

4 comments

Interesting. Another example is datetimes. You can't add datetimes. You can add a datetime and a time delta, and the difference of two datetimes is a timedelta.

I guess in maths this is called a generating Lie algebra (maybe someone can comment on this?)

I think the concept you are looking for is a ["torsor"](https://en.wikipedia.org/wiki/Principal_homogeneous_space).

Basically,

1. You have a 0 time delta, and you can add and subtract them satisfying some natural equations. (time deltas form a group)

2. You can add time deltas to a datetime to get a new datetime, and this satisfies some natural equations relating to adding time deltas to each other (time deltas act on datetimes).

3. You can subtract two datetimes to get a time delta satisfying some more natural equations (the action is free and transitive).

The term I was looking for was affine structure, as I commented to someone else. But from your link, which I can't understand entirely, I get the sense that a torsor is an even bigger generalization.
> The term I was looking for was affine structure, as I commented to someone else. But from your link, which I can't understand entirely, I get the sense that a torsor is an even bigger generalization.

An affine space is a torsor under a vector space, and you can have instead a torsor under any group. This loses a bit of structure, in the sense that you can take convex combinations in an affine space but not in an arbitrary torsor; but otherwise it is a proper generalisation. But the convex combination $(a + b)/2$ used to obtain a midpoint is exactly what we want here!

Indeed, torsors have exactly the properties you describe, but notably not the ability to find the midpoint between two points (that would involve extracting square roots in a group, which is not guaranteed possible, or uniquely defined when possible).
And in fact finding the midpoint is not possible half the time in the space we're interested in (https://news.ycombinator.com/edit?id=33497270). So what is the algebraic structure that underlies the binary-search algorithm, since evidently it isn't really the torsor of a group?
> So what is the algebraic structure that underlies the binary-search algorithm, since evidently it isn't really the torsor of a group?

Though it pains me to say so as an algebraist, I think that it probably just isn't a problem most usefully modelled with a more abstract algebraic structure. Although it would be easy to cook up a structure permitting "division with rounding" … maybe a Euclidean domain (https://en.wikipedia.org/wiki/Euclidean_domain) is something like the right structure?

I'm not an algebraist! So maybe what I'm about to say is dumb:

As I understand it, the correctness of any correct algorithm is conditioned on some requirements about the data types it's operating on and the operations applicable to it. Once you formalize that set of requirements, you can apply the algorithm (correctly) to any data type whose operations fulfill them.

But some sets of data and some operations on them that fulfill some formally stated requirements are just an abstract algebra, aren't they? Like, if you have two data types {F, V} and operations on them {+, ×, ·, +⃗} that fulfill the vector-space axioms, they're a vector space, so any vector-space algorithm will work on them. So every algorithm (or, more accurately, every theorem about an algorithm) defines some algebraic structure (or several of them), which may or may not be a well-known one.

For binary search you have two sets: the set of indices I, which is what we've been talking about, and the set E of elements that might occur in the array a you're searching through. You need a total order on E, and I think you need a total order on I, and the array a needs to be sorted such that aᵢaⱼ if i < j (though not conversely, since it might contain duplicates). You need a midpoint operation mid(i, j) to compute a new index from any two existing indices such that i ≤ mid(i, j) < j if i < j. And "mid" needs a sort of well-foundedness property on I to guarantee that the recursion eventually terminates; for any subset of ℤ you can define a "mid" that fulfills this, but for example in ℚ or ℝ you cannot, at least not with the usual ordering.

I don't think a Euclidean domain is quite the right thing for I because there's no obvious total order, and I think you need a total order to state the sortedness precondition on the array contents. Also, lots of Euclidean domains (like ℚ or ℝ) are dense and therefore could lead you to nonterminating recursion. And it isn't obvious to me that "mid" requires so much structure from I.

If you care about complexity you also have to define the cost of operations like ≤, mid, and array indexing, so your algorithm is no longer defined on just an abstract algebra, but an abstract algebra augmented with a cost model. But for correctness and termination that isn't required.

> I guess in maths this is called a generating Lie algebra

This is often called an affine structure.

This is the term I was looking for, thank you.
Not all metric spaces have midpoints (or unique midpoints) so it’s not true you can compute a midpoint any time you have a distance function (you are right you can define it but that’s kind of useless computationally since it doesn’t give you an algorithm).
If we're going the pedantic route, note that you don't need (and in fact half the time cannot have) uniqueness in our case anyway. There isn't really a unique midpoint for {0, 1, 2, 3}; both 1 and 2 are valid midpoints, even for binary search. We just pick the first one arbitrarily and work with that.

But note that that sentence was just about calculating midpoints, not about the larger binary search algorithm. And in any case, I was just trying to convey layman intuition, not write a mathematically precise theorem.

This should also be obvious after a bit of thought to anyone who has worked with timestamps, and is also well-known in e.g. animation where midpoint is just a special case of p=0.5.
Not sure about midpoint being well-defined on anything we can subtract.. Z and R are infinite.. there are a lot of values in there that don't compute.

To vary your point here, the axioms for twos complement and IEEE floating-point aren't well known or observed.

There are countably infinite turing machines and there is one for every element in Z. But there are uncountably infinite real numbers, so we’re out of luck for almost all of them.