by kragen 196 days ago

Probably not. The conventional math notation has three major advantages over the "[n]o superscripts or subscripts or [G]reek letters and weird symbols" you're proposing:

1. It's more human-readable. The superscripts and subscripts and weird symbols permit preattentive processing of formula structures, accelerating pattern recognition.

2. It's familiar. Novel math notations face the same problem as alternative English orthographies like Shavian (https://en.wikipedia.org/wiki/Shavian_alphabet) in that, however logical they may be, the audience they'd need to appeal to consists of people who have spent 50 years restructuring their brains into specialized machines to process the conventional notation. Aim t3mpted te rait qe r3st ev q1s c0m3nt 1n mai on alterned1v i6gl1c orx2grefi http://canonical.org/~kragen/alphanumerenglish bet ai qi6k ail rez1st qe t3mpt8cen because, even though it's a much better way to spell English, nobody would understand it.

3. It's optimized for rewriting a formula many times. When you write a computer program, you only write it once, so there isn't a great burden in using a notation like (eq (deriv x (pow e y)) (mul (pow e y) (deriv x y)) 1), which takes 54 characters to say what the conventional math notation¹ says in 16 characters³. But, when you're performing algebraic transformations of a formula, you're writing the same formula over and over again in different forms, sometimes only slightly transformed; the line before that one said (eq (deriv x (pow e y)) (deriv x x) 1), for example². For this purpose, brevity is essential, and as we know from information theory, brevity is proportional to the logarithm of the number of different weird symbols you use.
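The character count and the information-theory point can be checked with a short sketch. The S-expression is copied from the comment; `symbols_needed` is a hypothetical helper, and the figure of 10**12 distinct formulas is an arbitrary assumption chosen only for illustration.

```python
# The S-expression quoted in the comment above.
sexpr = "(eq (deriv x (pow e y)) (mul (pow e y) (deriv x y)) 1)"
print(len(sexpr))  # 54


def symbols_needed(messages: int, alphabet_size: int) -> int:
    """Smallest n with alphabet_size**n >= messages (integer arithmetic only)."""
    n, capacity = 0, 1
    while capacity < messages:
        capacity *= alphabet_size
        n += 1
    return n


# Hypothetical workload: telling apart 10**12 different formulas.
for alphabet_size in (2, 32, 1024):
    print(alphabet_size, symbols_needed(10**12, alphabet_size))
# 2 40
# 32 8
# 1024 4
```

The alphabets of 2, 32, and 1024 symbols carry 1, 5, and 10 bits per symbol, and the required lengths shrink by the same factors, which is the sense in which brevity tracks the logarithm of the alphabet size.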

We could certainly improve conventional math notation, and in fact mathematicians invent new notation all the time in order to do so, but the direction you're suggesting would not be an improvement.

People do make this suggestion all the time. I think it's prompted by this experience where they have always found math difficult, they've always found math notation difficult, and they infer that the former is because of the latter. This inference, although reasonable, is incorrect. Math is inherently difficult, as far as anybody knows (an observation famously attributed to Euclid) and the difficult notation actually makes it easier. Undergraduates routinely perform mental feats that defied Archimedes because of it.

______

¹ \frac d{dx}e^y = e^y\frac{dy}{dx} = 1

² \frac d{dx}e^y = \frac d{dx}x = 1

³ See https://nbviewer.org/url/canonical.org/~kragen/sw/dev3/logar... for a cleaned-up version of the context where I wrote this equation down on paper the other day.
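To illustrate the "same formula over and over" point, here is one way the two footnoted lines could sit in a longer chain of rewrites. It assumes, as footnote ² implies, that e^y = x; the first and last lines are a continuation added here and are not taken from the linked notebook.

```latex
\begin{align*}
e^y &= x \\
\frac{d}{dx} e^y &= \frac{d}{dx} x = 1 \\
e^y \frac{dy}{dx} &= 1 \\
\frac{dy}{dx} &= \frac{1}{e^y} = \frac{1}{x}
\end{align*}
```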


zozbot234 196 days ago

> ... It's optimized for rewriting a formula many times.

It's not just "rewriting" arbitrarily either, but rewriting according to well-known rules of expression manipulation such as associativity, commutativity, distributivity of various operations, the properties of equality and order relations, etc. It's precisely when you have such strong identifiable properties that you tend to resort to operator-like notation in any formalism (including a programming language) - not least because that's where a notion of "rewriting some expression" will be at its most effective.

(This is generally true in reverse too; it's why e.g. text-like operators such as fadd() and fmul() are far better suited to the actual low-level properties of floating-point computation than FORTRAN-like symbolic expressions, which are sometimes overly misleading.)


kragen 196 days ago

Hmm, I'm not sure whether operator-like notation has any special advantage for commutativity and distributivity other than brevity. a + b and add(a, b) are equally easy to rewrite as b + a and add(b, a).

Maybe there is an advantage for associativity, in that rewriting add(a, add(b, c)) as add(add(a, b), c) is harder than rewriting a + b + c as a + b + c. Most of the time you would have just written add(a, b, c) in the first place. That doesn't handle a + b - c (add(a, sub(b, c)) vs. sub(add(a, b), c)) but the operator syntax stops helping in that case when your expression is a - b + c instead, which is not a - (b + c) but a - (b - c).
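A minimal sketch of the associativity point, representing function-call notation as nested tuples. The tuple encoding and the helper `flatten_add` are assumptions made for illustration, not anything from the thread.

```python
def flatten_add(tree):
    """Return the operands of nested binary add nodes, left to right."""
    if isinstance(tree, tuple) and tree[0] == "add":
        _, left, right = tree
        return flatten_add(left) + flatten_add(right)
    return [tree]


right_nested = ("add", "a", ("add", "b", "c"))   # add(a, add(b, c))
left_nested = ("add", ("add", "a", "b"), "c")    # add(add(a, b), c)

# As trees the two differ, so moving between them is a real rewrite step...
print(right_nested == left_nested)               # False
# ...but both print as the same infix string, so in infix the step is free.
print(" + ".join(flatten_add(right_nested)))     # a + b + c
print(" + ".join(flatten_add(left_nested)))      # a + b + c

# The subtraction caveat: a - b + c parses as (a - b) + c,
# which equals a - (b - c), not a - (b + c).
a, b, c = 10, 4, 3
print(a - b + c, a - (b - c), a - (b + c))       # 9 9 3
```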

Presumably the notorious non-associativity of floating-point addition is what you're referring to with respect to fadd() and fmul()?

I guess floating-point multiplication isn't quite associative either, but the simplest example I could come up with was 0.0 * 603367941593515.0 * 2.9794309755910265e+293, which can be either 0 or NaN depending on how you associate it. There are also examples where you lose bits of precision to gradual underflow, like 8.329957634267304e-06 * 2.2853928075274668e-304 * 6.1924494876619e+16. But I feel like these edge cases matter fairly rarely?

On my third try I got 3.0 * 61.0 * 147659004176083.0, which isn't an edge case at all, and rounds differently depending on the order you do the multiplications in. But it's an error of about one part in 10¹⁶, and I'd think that algorithms that would be broken by such a small amount of rounding error are mostly broken in floating point anyway?

I am pretty sure that both operators are commutative.
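The rounding example can be reproduced with a few lines of Python, assuming IEEE 754 double precision with the default round-to-nearest-even mode (what CPython floats use on common platforms). The overflow case at the end uses round numbers picked for this sketch rather than the constants quoted above.

```python
import math

c = 147659004176083.0  # exactly representable, since it is below 2**53

left = (3.0 * 61.0) * c    # how Python evaluates 3.0 * 61.0 * c
right = 3.0 * (61.0 * c)   # the other association
print(left == right)           # False
print(right - left)            # 4.0
print((right - left) / left)   # roughly 1.5e-16

# Swapping the operands of a single multiplication changes nothing.
print(61.0 * c == c * 61.0)    # True

# A cruder overflow case: 1e200 * 1e200 overflows to infinity,
# and 0.0 * infinity is NaN.
print((0.0 * 1e200) * 1e200)               # 0.0
print(math.isnan(0.0 * (1e200 * 1e200)))   # True
```

The difference arises because 61 × 147659004176083 is an odd integer just above 2**53, so it sits exactly halfway between two doubles and gets rounded before the final multiplication by 3.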


zozbot234 196 days ago

We do often find add(a, b, c), just written as Σ(a, b, c). Similar for mul and Π. The binary sub operator can be simply rewritten in terms of add and unary minus; the fact that we write (a - b) instead of (a + [-b]) or perhaps Σ(a, [-b]) is ultimately a matter of notational convenience, but comes at some cost in mathematical elegance. Considering operators that are commutative yet not associative is not very useful; ultimately we want more from our expression rewriting than just flipping left and right subexpressions within an expression tree while keeping the overall complexity unchanged.


kragen 196 days ago

Usually you'd have to write that as \sum_{v \in \{a, b, c\}} v; one of the ways I think conventional math notation could in fact be improved would be by separating the aggregate function of summation from the generation of the items, allowing you to write \sum \{a, b, c\}, at the minor cost of having to write \sum_{i = 1}^N i^2 as something like \sum |_{i=1}^N i^2.
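For comparison, Python's built-in `sum` already separates the aggregate from the generation of the items in roughly this way. The values of `a`, `b`, `c`, and `N` below are arbitrary placeholders.

```python
a, b, c = 2, 3, 5  # arbitrary example values
N = 4

# Aggregating an explicit collection, like \sum \{a, b, c\}.
# (A list is used rather than a set so that equal values are not merged.)
print(sum([a, b, c]))                        # 10

# Aggregating generated items, like \sum_{i = 1}^N i^2.
print(sum(i**2 for i in range(1, N + 1)))    # 30
```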

It's not conventional to write commutative-but-not-associative functions as infix operators, but I don't think that's due to some principled reason, but just because they're not very common; non-associative operators such as subtraction and function application are almost universally written with infix operators, even the empty-string operator in the case of function application. The most common commutative-but-not-associative operator written infix is probably the Sheffer stroke for NAND (although Sheffer himself used it to mean NOR in his 01913 paper: https://www.ams.org/journals/tran/1913-014-04/S0002-9947-191...).

You can go a bit further in the direction of logical manipulability, as George Spencer Brown did with "Laws of Form" (LoF): his logical connective, the "cross", is an N-ary negation function whose arguments are written under the operation symbol without separators between them, and he denotes one of the elementary boolean values as the empty string (let's call it false, making the cross NOR). ASCII isn't good at reproducing his "cross" notation, but if we use brackets instead, we can represent his two axioms as:

    [][] = []  (not false or not false is not false)
    [[]] =     (not not false is false)

In this way Spencer Brown harnesses the free monoid on his symbols: the empty string is the identity element of the free monoid, so appending it to the arguments of a cross doesn't change them and thus can't change the cross's value. Homomorphically, false is the identity element of disjunction, which is a bounded semilattice, and thus a monoid.

This allows not only the associative axiom but also the identity axiom to be simple string identity, which seems like a real notational advantage. (Too bad there isn't any equivalent for the commutative axiom.) It allows Spencer Brown to derive all of Boolean logic from those two simple axioms.
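As a rough illustration of how little machinery the bracket rendering needs, the two axioms can be applied as plain string rewrites. `reduce_form` is a hypothetical helper written for this sketch; it handles only variable-free forms, each of which ends up as either `[]` or the empty string.

```python
def reduce_form(form: str) -> str:
    """Apply both axioms as left-to-right rewrites until neither applies."""
    while True:
        reduced = form.replace("[[]]", "").replace("[][]", "[]")
        if reduced == form:
            return form
        form = reduced


print(repr(reduce_form("[][]")))      # '[]'
print(repr(reduce_form("[[]]")))      # ''
print(repr(reduce_form("[[][]]")))    # ''    NOR(true, true) is false
print(repr(reduce_form("[[[]]][]")))  # '[]'

# The identity axiom really is string identity: appending the empty
# string to the contents of a cross gives the very same string.
print("[" + "" + "]" == "[]")         # True
```

The loop terminates because every rewrite shortens the string, and any balanced string other than `[]` or the empty string contains one of the two patterns.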

However, so far, I haven't found that the LoF notation is an actual improvement over conventional algebraic notation. Things like normalization to disjunctive normal form seem much more confusing:

    a(b + c)  → ab + ac          (conventional notation, rewrite rule towards DNF)
    [[a][bc]] → [[a][b]][[a][c]] (LoF notation)

It's a little less noisy in Spencer Brown's original two-dimensional representation (note that the vertical breaks between the U+2502 BOX DRAWINGS LIGHT VERTICAL characters are not supposed to be there; possibly if you paste this into a text editor or terminal it will look better)

    ┌─────    ┌────┌────
    │┌─┌──  → │┌─┌─│┌─┌─
    ││a│bc    ││a│b││a│c

but not, to my eye, any less confusing.
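The bracketed rewrite rule can be checked against the conventional one by brute force. This is a minimal sketch under the reading used above (juxtaposition is OR, a cross is NOT of its contents, the empty string is false); `evaluate` and `equivalent` are hypothetical helpers, and variables are assumed to be single letters.

```python
from itertools import product


def evaluate(form: str, env: dict) -> bool:
    """Evaluate a bracket form: juxtaposition is OR, [...] is NOT, empty is False."""

    def space(i: int):
        # OR together the items from index i up to a closing "]" or the end.
        value = False
        while i < len(form) and form[i] != "]":
            if form[i] == "[":
                inner, i = space(i + 1)
                item = not inner
                i += 1  # skip the matching "]"
            else:
                item = env[form[i]]
                i += 1
            value = value or item
        return value, i

    return space(0)[0]


def equivalent(lhs: str, rhs: str, names: str) -> bool:
    """True if both forms agree under every assignment to the named variables."""
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if evaluate(lhs, env) != evaluate(rhs, env):
            return False
    return True


print(equivalent("[[a][bc]]", "[[a][b]][[a][c]]", "abc"))  # True

# Cross-check the left-hand side against a(b + c) in ordinary Boolean terms.
print(all(
    evaluate("[[a][bc]]", {"a": a, "b": b, "c": c}) == (a and (b or c))
    for a, b, c in product([False, True], repeat=3)
))  # True
```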


kragen 196 days ago

A thing I didn't appreciate the first time I read Spencer-Brown's book is that he actually cites Sheffer's 01913 paper, and proves Sheffer's postulates within his system in an appendix. This situates him significantly closer to the mathematical mainstream than I had thought previously, however flawed his proof of the four-color theorem may have been.

Also, the axioms I cited above are written in his notation on his gravestone: https://en.wikipedia.org/wiki/G._Spencer-Brown#/media/File:G... but I have evidently reversed left and right in my rendering of the DNF rewrite rule above. It should be:

    ─────┐   ────┐────┐
    ─┐──┐│ → ─┐─┐│─┐─┐│
    a│bc││   a│b││a│c││

His first statement of the first axiom in the book is a little more general than the version I reproduced earlier and which is inscribed on his gravestone; rather than his "form of condensation"

    [][] = []

his "law of calling" is general idempotence, i.e.,

    AA = A

although the two statements are equipotent within the system he constructs. Similarly, before stating his "form of cancellation"

    [[]] =

he phrases it as the "law of crossing", which I interpret as

    [[A]] = A
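Under the NOR reading used earlier in the thread, both general laws can be checked directly with Booleans. The function `cross` below is a hypothetical stand-in for the cross, modelled as an N-ary NOR; it is a sketch of that interpretation, not of Spencer-Brown's own calculus.

```python
def cross(*contents: bool) -> bool:
    """N-ary NOR: true exactly when no argument is true; cross() is 'not false'."""
    return not any(contents)


# Ground instances: the forms of condensation and cancellation.
print((cross() or cross()) == cross())   # True, i.e. [][] = []
print(cross(cross()) is False)           # True, i.e. [[]] =

# General laws, with A ranging over both Boolean values.
for A in (False, True):
    assert (A or A) == A            # law of calling:  AA = A
    assert cross(cross(A)) == A     # law of crossing: [[A]] = A
```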


bmacho 196 days ago

AsciiMath makes easy equations easy to read.

1 and 2 would be

  1) d/dx e^y = e^y dy/dx = 1
  2) d/dx e^y = d/dx x = 1

edit: edited, first got them wrong


kragen 196 days ago

When you render it for proper typesetting, do the parentheses around dy/dx disappear? (Oh, I guess you've removed them in your edit.)

If they do, it seems like an error-prone way to write your math.

If they don't, it seems like it will make your math look terrible.

Supposing that the parentheses aren't necessary, as implied by your edit: how does AsciiMath determine that e^y isn't in the numerator in "e^y dy/dx", or (worse) in the denominator in "d/dx e^y"?

It seems somewhat less noisy than the LaTeX version, but not much; assuming I can insert whitespace harmlessly:

  \frac d{dx}e^y = e^y\frac{dy}{dx} = 1
        d/dx e^y = e^y      dy/dx   = 1

  \frac d{dx}e^y = \frac d{dx}x = 1
        d/dx e^y =       d/dx x = 1


bmacho 195 days ago

Here is an online renderer and the description: https://asciimath.org/

The rules are basically the same as LaTeX's, but with saner symbol names and support for fractions; \ is not needed before symbols, and () can be used instead of {}.

> Supposing that the parentheses aren't necessary, as implied by your edit: how does AsciiMath determine that e^y isn't in the numerator in "e^y dy/dx"

It seems to me that dx, dy, dz, dt behave like numbers, single-letter variables and symbols (probably they are symbols, but not listed for some reason). Just as LaTeX doesn't need {} parentheses for numbers, single-letter variables and symbols, AsciiMath allows omitting them too.

So `/` captures a single number/symbol/variable to the left of it, and that is `dy`. But if there were `du`, for example, it would only capture `u`, and you would need to put `du` between parentheses.
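The rule as described here can be mimicked with a toy tokenizer. This is only a model of the behaviour guessed at in this comment, not AsciiMath's actual grammar or code; the symbol list and the function `toy_fractions` are assumptions made for illustration.

```python
import re

# Assumption for illustration only: these differentials are single tokens,
# as the comment above guesses. This is not AsciiMath's real symbol table.
MULTI_LETTER_SYMBOLS = ["dx", "dy", "dz", "dt"]

# Try the multi-letter symbols first, then numbers, single letters,
# and finally any other non-space character.
TOKEN = re.compile("|".join(MULTI_LETTER_SYMBOLS) + r"|\d+|[A-Za-z]|\S")


def toy_fractions(source: str) -> list:
    """Group each '/' with exactly one token on either side of it."""
    tokens = TOKEN.findall(source)
    out, i = [], 0
    while i < len(tokens):
        if i + 2 < len(tokens) and tokens[i + 1] == "/":
            out.append(("frac", tokens[i], tokens[i + 2]))
            i += 3
        else:
            out.append(tokens[i])
            i += 1
    return out


print(toy_fractions("dy/dx"))      # [('frac', 'dy', 'dx')]
print(toy_fractions("du/dx"))      # ['d', ('frac', 'u', 'dx')]
print(toy_fractions("e^y dy/dx"))  # ['e', '^', 'y', ('frac', 'dy', 'dx')]
```

In this model `e^y` stays out of the numerator simply because the slash never looks further than one token to its left.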


kragen 195 days ago

Thanks! It does better than I expected on tricky input like [0, 1/2). It seems like there are a lot of special cases, though. It does indeed remove parentheses from the output in some cases but not others.

Probably figuring out how to write things in AsciiMath is more trouble than copying and pasting them from Wikipedia though. (The alt text on equation images is the LaTeX source preceded with \displaystyle.)

How do you do \bigg(\big((4x + 2)x + 1\big)x - 3\bigg)x + 5 in AsciiMath? (((4x + 2)x + 1)x - 3)x + 5 makes all the parens the same size.


xigoi 195 days ago

Why would you want to manually set the sizes of parens? I always use \left \right when writing LaTeX (and having to do it is one of the reasons I hate LaTeX math notation).


kragen 195 days ago

Because \left( ... \right) doesn't give very readable results in cases like that; all the parens end up the same size.
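The two variants can be compared side by side in any LaTeX document with a fragment like the following. The first line is the expression from earlier in the thread; the second is the same expression rewritten with \left and \right for this comparison.

```latex
% Manually sized delimiters: each nesting level is visibly larger.
\[ \bigg(\big((4x + 2)x + 1\big)x - 3\bigg)x + 5 \]

% Automatically sized delimiters: the contents are all one line high,
% so every pair comes out at the same normal size.
\[ \left(\left(\left(4x + 2\right)x + 1\right)x - 3\right)x + 5 \]
```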
