Hacker News new | ask | show | jobs
by crdrost 2305 days ago
You're describing some sort of type mismatch between two concepts I think, and I really don't understand it: and I feel like as someone who occasionally teaches these things I really would like to. Could you elaborate?

For my part, I do like to think of adjoining numbers onto an existing system, but that immediately becomes matrices.

So you decide to adjoin an ε such that ε² = 0. Your numbers are now vectors (a, b) and the action of ε is to map this to (b, 0) so that it is represented by the matrix

    [ 0, 0 ]
    [ 1, 0 ]
And thus the number a + b ε is perfectly encoded in the matrix algebra as a I + b ε =

    [ a, 0 ]
    [ b, a ]
This sort of trick is really helpful for programmers because we often have a field of bigints or rationals that we want to adjoin an irrational number to. So for example when you want to do Fibonaccis using exponentiation by squaring, it helps (although this is not obvious at first) to adjoin the golden ratio φ satisfying φ² = 1 + φ to your bigints, so that your numbers look like a + b φ =

    [ a,   b   ]
    [ b, a + b ]
Then F_n = [1, 0] . φ^(n+1) . [1, 0] and you can exponentiate by squaring straightforwardly. So you start from [0,1;1,1] and square that to [1,1;1,2] and square that to [2,3;3,5] and square that to [13,21;21,34] and square that to [610, 987; 987, 1597], so you get to skip ahead past 55, 89, 144, 233, and 377, at the cost that multiplications are slower than additions but also you potentially get to allocate less memory.

When you adjoin to the reals an i such that i² = -1 these matrices have the shape of the 2D scaled rotations, so that is what complex numbers just “are” to me. The representation as a tuple is to me the same as representing the above matrix as (a, b) or (a, b, a+b) to save time or space... The whole thing is a matrix but the entries are indeed redundant and so you don't need to store all of them at once.

So that's why I do not understand what you mean by these being separate concepts. Can you elaborate?

1 comments

I'd like to understand what you're doing, but I'm missing something.

1. Adjoining ε such that ε² = 0.

If the numbers have the form a + bε, isn't the action of ε to map (a, b) to (0, a)? Did you mean to say that ε = [0,1; 0,0]?

2. Adjoining φ = (1 + √5)/2 = [0,1; 1,1].

I'm fascinated by the idea of determining that a 2x2 matrix is equal to a real number.

I see that if you take successive powers of the real number φ, and express them in the form aφ + b, the coefficients a and b will take on values from the Fibonacci sequence. So far so good.

I don't follow the claim that the matrix [0,1; 1,1] actually represents φ. This would imply that the formula F_n = [1, 0] · φ^(n+1) · [1, 0] means that F_N equals the real number 1 (= 1 + 0φ), times φ^{n+1}, times 1 again. But this isn't true. It certainly is true that [1,0][0,1; 1,1]^{n+1}[1,0] is equal to F_n, but I don't follow the interpretation as adjoined numeric values (as opposed to as coefficients of φ).

3. Equivalence of representations [a,b] and aI + bM, where M is a 2x2 matrix representing any adjoined number.

This looks to me like the claim that if [a,b]M = [c,d], then (aI + bM)M = cI + dM. I tried to work this out algebraically and I'm pretty sure it isn't true in general. I'm open to being told that I'm wrong about this. Have I interpreted the idea correctly? What are the conditions under which the equivalence holds?

(1) This gets a little into difficult notation but e.g. in Mathematica-style notation,

    {{0,0},{1,0}} {{a},{b}} = {{0}, {a}}.
I think that you are preferring to left multiply your matrices by your points so that your points remain horizontal, whereas I am just used to my vectors being column vectors that I write as points sometimes? So that is why we are getting transposes of each other's notation.

(2) So the case for φ is very similar to the case for ε: you want to start with the constitutive relation, in this case φ² = φ + 1, to build the matrix with first column {{0},{1}} (multiplying 1 by φ gives φ) and the second column {{1},{1}} (the above constitutive relation).

The isomorphism is then that if X represents this adjoined unit then the point {{a}, {b}} becomes a I + b X one way [or say if you have a cubic constitutive relation then {{a},{b},{c}} becomes a I + b X + c X² similarly] and M becomes M {{1},{0}} [or say M {{1},{0},{0}} etc].

I would definitely agree that this is probably a much narrower statement than your “if [a,b]M = [c,d], then (aI + bM)M = cI + dM” as the structure of X is very tight. It is always [e_2 e_3 ... e_n c] where e_i is a unit column vector with a 1 in the i’th place and 0s everywhere else, and c is the vector embodying the constitutive relation.

But if you want a condition, the isomorphism condition is probably the best place to go.

So the claim is that φ^n = F_{n-1} + F_n φ.