by Koshkin 2157 days ago
Mathematical notation is great at facilitating formal manipulations. This is its critical feature, and without it we would get stuck at the level of ancient mathematics. This is the reason it was invented a few hundred years ago in the first place. That said, I find that notation is often abused in texts as a mere substitute for the normal human language which, while it allows the text to be compressed, does in fact nothing to help the reader better understand what is being said but rather looks like a crazy mess of characters and other marks in a multitude of fonts, styles and sizes the only purpose of which seems to be to cause eye strain.
8 comments

I remember, as a callow college freshman waiting in the hallway of the math department to get into a classroom, reading an article about mathematical writing; the first thing it said was to prefer English text over mathematical symbols in numerous cases (e.g., writing "For all $x$ in the Reals" over $\forall x\in\mathbb{R}$). As someone who was highly skilled in TeX at a time when such skills were still fairly rare (many colleges, if they even had TeX available on their time share systems, were still using the am fonts instead of the cm fonts as the latter had only been introduced two years earlier), I remember being upset that my backslash skills were thus denigrated.
When I teach the intro to proofs class I require that they learn LaTeX. Some students like the availability of symbols so much that they go a bit nuts. Something like this sentence: $\forall x\in\mathbb{R}$ $\exists y$ that is $>$ the number $x+1$. Sigh.
This isn't exclusive to LaTeX; I think students in general just think that the symbols make things "more mathematical". I remember feeling this way briefly when I was first exposed to things like $\forall$ and $\exists$, and it wore off. This was in the late nineties; in theory I could have had access to a typesetting system but I was writing things by hand.
As a former maths student, I loved the "compressibility" of using symbols for things like "forAll" and "thereExists". They're quicker to write and allow fitting in more information in less space, two qualities which become highly useful when taking tests in limited amounts of time on sheets of paper with limited space (albeit needing more paper to fit your test answers on is more arguable in its negativity).

For an (even more) subjective point of view, as part of my studies I had oral practicals every 2 weeks or so where I was basically standing at a whiteboard along with 2 classmates, each solving problems given to us by an examiner for ~1 hour. It's simply easier for a lot of students to draw a symbol that's legible from 5 meters away than some words without making them huge.

"does in fact nothing to help the reader better understand what is being said"

For a mathematician it is the opposite: they just wish it would be written with symbols so that they could know precisely what the book is trying to say.

I don't think this is true in general. It may be true for a novice, who needs all the available help to keep them rigorous (but even then, there is definitely room for reading-to-build-intuition), but symbols definitely slow you down while you translate them.
Do you really think

> speed = distance / time

is less clear than

> speed is the ratio of distance over time

? To me the first equation is genuinely easier to read for the purposes of understanding, not just for formal manipulation. For larger equations the difference is only more stark, not less. I have a maths PhD so your comment about symbols being for novices doesn't apply.

I suppose the beauty of the first equation is that the objects (speed, distance, time) are visually very distinct from the operations and relationships. In the second form, there's a bit of a word soup so you need to "manually" parse the sentence rather than letting your eyes (really the visual cortex) do that bit of the processing for you.

But that example is a little bit artificial, isn't it? A lot of mathematical concepts are more complex than that and sometimes symbols are not the best option. Say, for example the definition of Hausdorff space, in words and symbols:

- Any two distinct points in the space have disjoint neighbourhoods.

- ∀x,y ∈ X with x ≠ y, ∃U,V ⊂ X s.t. x ∈ U, y ∈ V and U ∩ V = ∅.

Another example would be Navier-Stokes equations, where they're much easier to understand in words than in symbols. Symbols are ok when you don't have to search too much to see what they mean and when the idea you're trying to transmit with them is relatively simple, but trying to build complicated phrases and definitions with symbols, for me, ends up being a mess.

edit: fixed hausdorff space definition, points should be distinct

I find it a bit funny that the whole discussion seems to be on what definition is better. The two complement each other perfectly, and reading them actually makes both clearer for me.

Since there is in the modern world almost no use case where you're heavily space constrained (maybe the final print version of an article for some publications? I'm not that familiar with the research world), I don't see why you'd try to choose one instead of including both where needed.

You are comparing apples to oranges. Your English definition is missing the definition of neighborhood. To actually be an accurate comparison you'd need to add ", where a neighborhood is a set that contains an open ball that contains the point." But now, you're also missing the definition of open ball...
Doesn't that add to the point that sometimes literal definitions are better than symbolic? Either you have a symbol that says "this is a neighbourhood" or you have the symbolic definition of neighbourhood (which is missing in my symbolic definition of the space, btw, I just noticed) and then you force the reader to identify those symbols and say "oh, this is a neighbourhood". The former is the same issue as in English, and the latter adds unnecessary complexity (no people reading about Hausdorff spaces will be unfamiliar with the concept of a neighbourhood of a point).
If every definition needed to redefine the definitions of its terms, it would be impossible to discuss anything.

Within a certain domain, it is assumed you know basic definitions within it that can be used to talk about more complex things.

I think my overall point is that symbols can significantly make things simpler, not that they always do. It's very much like the analogy with diagrams that I made in that comment: making things visual can help enormously, but a bad diagram or equation can be as unclear, or worse, than some descriptive text.

Communication is hard, and unfortunately doing it well requires experience and thought rather than simplistic rules like "always use symbols rather than words" or vice-versa.

Curiously, you have been ambiguous in the mathematical notation. Counterproof: let x = y.
IMO it should be "Any two distinct points in..." in English as well. Precision is hard!
Unfortunately that's not uncommon in mathematical texts either!
You are indeed right!
> - ∀x,y ∈ X with x ≠ y, ∃U,V ⊂ X s.t. x ∈ U, y ∈ V and U ∩ V = ∅.

Amusingly and to go in the direction of your argument, this is not a very rigorous definition of a Hausdorff space. You should introduce the topology τ associated with X and specify that U and V are open sets of τ.
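For what it's worth, once the topology is made explicit the condition is easy to check mechanically on a finite space. Here is a minimal Python sketch, assuming the topology is handed over as a plain list of open sets; the function name `is_hausdorff` and the two example topologies are my own illustration, and the function does not verify that the list really is a topology.

```python
from itertools import combinations

def is_hausdorff(points, open_sets):
    """Check the Hausdorff condition on a finite space.

    open_sets is assumed (not verified) to be the topology tau on points.
    """
    opens = [frozenset(s) for s in open_sets]
    for x, y in combinations(points, 2):  # every pair of distinct points
        # Look for open U containing x and open V containing y with U and V disjoint.
        separated = any(
            x in U and y in V and not (U & V)
            for U in opens
            for V in opens
        )
        if not separated:
            return False
    return True

X = {0, 1}
discrete = [set(), {0}, {1}, {0, 1}]  # every subset is open
sierpinski = [set(), {1}, {0, 1}]     # the only open set containing 0 is all of X

discrete_is_hausdorff = is_hausdorff(X, discrete)
sierpinski_is_hausdorff = is_hausdorff(X, sierpinski)
```

The discrete topology passes and the Sierpinski space fails, because every open set containing 0 also contains 1. Note that the check only makes sense because U and V range over the open sets, which is exactly the part the symbolic definition left out.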

That's not a genuine example.

Most academic texts I've read will use something like

    s = d/t
And sometimes not even explain what the components mean, because obviously 't' stands for time. That's what all their lecturers used, so why bother explaining it?

Some will even make up their own notation:

    s = d(t)
And somewhere will say "f(x) in this article describes an inverse multiplicative relation", without explaining that it's actually simple division, so other academics won't find them too obvious or boorish.

Some even take it a step further and use random Greek letters (without exhausting the English alphabet first ofc.), where small gamma and big gamma mean completely different and unrelated things.

That sounds more like an issue of unclear naming than of notation itself.

Edit: Though I agree, mathematical notation lends itself to using single letters to denote objects. This can of course be problematic. I'm a fan of 'pseudocode' - somewhere between natural language and rigorous notation - where possible.

>s = d/t

Slightly worse than that. That equation is always written as v = s/t. v represents velocity and s represents distance, for some reason.

I would assume that the "s" comes either from Latin spatium or German Strecke.
This is (velocity) = (displacement) / (time). s is used because of the Latin word _spatium_ for space. If I remember correctly, there was a difference between distance and displacement. (displacement is "net").
I'm certainly not saying that symbols are always bad. Rather, I am attempting to argue that the parent is false in asserting that "mathematician[s]… just wish it would be written with symbols so that they could know precisely what the book is trying to say". The Hausdorff space example is a good one: it's very easy to say what a Hausdorff space is, but if you have to spell it out formally then the definition is kind of big and ugly.
The first one isn't really symbolic though. So really you should compare:

- speed is the derivative of position with respect to time.

and

- Let x(t) be the position of an object at time t then its speed v(t) is:

   v(t) = (dx/dt)(t)
Also note that most of the time I'm just putting the symbols after the word explaining what it means, while this does allow me to use the symbolic notation for differentiation it doesn't really make the first part any shorter. Also it would be a mistake to introduce speed by just 1 specific formula (even if I didn't specify the types of the object involved) since speed is a far more general concept.
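To make the symbolic version concrete, here is a small Python sketch that approximates v(t) = (dx/dt)(t) numerically with a central difference. The position function (free fall from rest) and the step size are hypothetical choices for illustration only.

```python
def velocity(x, t, h=1e-6):
    """Approximate v(t) = (dx/dt)(t) with a central difference of step h."""
    return (x(t + h) - x(t - h)) / (2 * h)

def position(t):
    # Hypothetical example: free fall from rest, position in metres.
    return 4.9 * t ** 2

v_at_2 = velocity(position, 2.0)  # the exact derivative is 9.8 * t, i.e. 19.6 at t = 2
```

The code takes the function x as its argument, just as the notation v(t) = (dx/dt)(t) does, rather than a single distance and a single time.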
The problem is when it is not quite clear what the symbolic notation stands for. With division, that tends to be less of a problem.
* I find it quite unusual in practice for genuinely new symbolic notation to be used by an author. Maybe that just reflects the fields I read about most (information theory, Bayesian modelling, harmonic analysis).

* Usually you don't come across a journal article or even blog post with a single isolated equation. So any new or unusual notation can be explained once and reused many times.

* Even if you did have an isolated equation with unusual notation, I still think it's more clear to define the notation and then show the equation rather than spelling it out in words! (I'm sure you could find terrible counterexamples!) The visual benefit seems so great to me that it would be worth splitting it into two parts. A bit like a diagram or a graph can make things clearer even if it needs a bit of explanation.

It doesn't need to be genuinely new to be confusing. It just needs to be unfamiliar to the reader. At the very least, one should point the reader to a resource where they can read about the notation.
> * I find it quite unusual in practice for genuinely new symbolic notation to be introduced by an author.

This sentence seems to contradict the rest of your comment; did you mean "I find it quite unusual in practice for genuinely new symbolic notation to NOT be introduced by an author."?

I've got a PhD in maths and can translate symbols into concepts in my brain much quicker than words into concepts in general. Also writing symbolically forces a rigour on the writer. I've been trying to read some semi-mathematical stuff written by scientists but non-mathematicians recently and it is painful trying to figure out what they really mean!
Sure, but the problem is that it was written by non-mathematicians, not that they were not using symbols. That's kind of what I was trying to say: symbols help you be rigorous, but they slow you down, and often the complete text-on-the-page doesn't actually need all the rigour that's forced on you by the symbols.
As a math PhD I disagree. I can read math notation far faster and more accurately than ambiguous English. We don’t translate symbols. If anything, when reading English we have to translate into symbols.

For example, reading 5-7, I don’t have to translate the - symbol to the word “subtract”. I know this is -2. And I don’t translate the - in that to the word “negative,” and certainly not to the word “subtract”. And it’s vastly faster to agree 5-7=-2 is correct than “5 subtract 7 equals negative 2.”

Symbols are how mathematicians think.

I think you've got the same misconception as quietbritishjim above. Start stacking quantifiers, and the symbols get hairy much faster than the English does.
Maybe you're not used to reading quantifiers. I also find them much easier, faster, and more accurate to read, because it's from practice.

Sure you can take simple English statements, and write them with quantifiers, and claim the English is simpler. But going the other way, expressing complex items in English, is a non-starter. English is far too sloppy, whereas the quantifier version is mathematically precise.

Try converting something professional, such as Godel's incompleteness proofs, into English. Without precise quantifiers you'd quickly get lost, make mistakes, and take forever to get anywhere.

For example, look at page 17 of the proof [1], at AG(6)(a) (after the "Thus"), where there is a long statement in logic. Convert to English (near impossible, certainly not possible without something like parentheses) and tell me it's easier for a logician to read. It's not. As written it's concise and parseable without confusion for a logician.

The simple English sentence case is a tiny part of what mathematicians do.

[1] https://web.yonsei.ac.kr/bkim/goedel.pdf

Look, do you really want me to wade through sixteen pages to discover the notation first (written by someone who doesn't know about \langle and \rangle, either)? I would consider it bad argumentative form of the same water as Euler's apocryphal "does God exist" debate with Diderot.

At the very least, you will struggle to persuade me that the use of \wedge is easier to understand than the English word "and" with line breaks.

Also you've picked a specific example where the objects of study are these long strings of symbols. Of course any paper worth its salt is going to use them - they're literally the things that the paper is there to examine. It's the metamathematical statements in this context, not the quotation of the formulas under study, that I want to replace with English.

Hm - well, I'm not a mathematician, but I am a programmer and I know that there are many, many times when I've seen somebody try to describe an algorithm in what ends up being incomprehensible English followed by a code example that actually clears up what it was they were trying to say.
We have symbolic expression in programming languages already.

It helps somewhat, but it isn’t a panacea.

We still have to comment our code and add context.

Thus:

1. Write a good narrative

and then

2. Write good code

Which is to say that a mathematics text is an instance of literate programming.

It's not surprising, then, that Knuth introduced it.
Compressing does help it. To trained eyes a couple of lines is easier to read and less ambiguous than one or two pages of English explanation, which is hard to follow for both trained and untrained eyes if you're gonna encounter a bunch of them while reading a proof.
To me it seems language can also be interpreted as more ambiguous, whereas well-defined symbols and notation are far more rigorous in definition and use.
Except you have instances of misuse of those well-defined symbols and notations. Without any accompanying language it can be very difficult to not only determine that there is an error, but then also figure out how to fix it.

And other times in some fields there is a lack of well-defined symbols and notations. I don't recall the specifics, but in game theory there is some notation that is ambiguous. There are two methods or forms of representation, and the field is split on which version the notation stands for. It's also frequently not mentioned explicitly in papers, so you have to dig around a bit to see if you can find a usage that makes the meaning clear.

I don't know why we can't just have both.

My main argument is that humans only have so much short-term memory, so compression is crucial for readability. Reading a proof is different from reading a book because you have to retain the whole thing with details in your mind, not just vague ideas.
I find myself completely unable to understand that. I suspect we use very different methods. Notation doesn't stay with me once my eyes move past it. The notation is merely there to inform me of what is happening, and as long as I understand the meaning it doesn't matter if it was one symbol or a whole paragraph.
There are some academic disciplines which are extremely guilty of this. They are the ones that are insecure about their status as a science.
Math symbols are a minor issue for me. What confuses me the most are descriptions of mathematical concepts.

For example, Wikipedia describes a 'field' like this:

"In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined and behave as the corresponding operations on rational and real numbers do."

It doesn't make sense to me. What does it mean if an operation 'is defined' on a set? Does it mean that any 2 elements combined together using that operation always need to output an element which is also in the same set? But if that was the case then "behave as the corresponding operations on rational and real numbers do" would mean that the fields would always need to be of infinite size (have an infinite number of elements) wouldn't it? Because if the field had a limited number of elements and you added the last two (highest) elements together, the property which requires that the result also be present in the same set could not be met because the result would be greater than the highest element in that set...

The problem is that if you start with a highly abstracted math concept and you dig through all the links and definitions of sub-concepts, they all have huge gaps like this... So when you try to combine all the definitions together to make sense of that original highly abstracted concept, you end up with tens or hundreds of possible interpretations. But in fact, Math should only have 1 interpretation for each concept so this is a very bad situation to be in.

I think math definitions should be more elaborate and repetitive if necessary. They should not try to sound terse and clever. They should not assume that the reader can fill in the gaps. The most rational readers will not be able to fill in the gaps because rational people know the dangers of making assumptions.

I'm a math researcher, and I'll explain why I like these sorts of definitions.

In the first place, what you quoted is not a formal, precise definition; it is not a substitute for such a definition, nor is it intended to be one. The Wikipedia page you mention has a precise definition further down the page.

So what, then, is the purpose of the description you quoted? Why include it at all?

Because it's how mathematicians conceptualize what a field is. It is the peg we hang our hat on; it is what we remember. A mathematician who has seen fields would be able to fill in the details; and if not, they would know to look up the precise definition in a textbook.

In short, these definitions are how we keep track of the forest at the same time as the trees.

I should note that taste differs among mathematicians, and you can find different styles of exposition in math books. Some are very formal and precise; whereas others are more informal and have lots of handwavy statements along the lines of the one you quoted.

I'd also like to point out that it is also frequent that one encounters equivalent but different formal definitions for the same mathematical structures, and this is why the informal descriptions are important as well.
> What does it mean if an operation 'is defined' on a set? Does it mean that any 2 elements combined together using that operation always need to output an element which is also in the same set?

For a binary operation f to be defined on a set, f(x,y) must exist for every x and y in the set. There is no requirement that f(x,y) itself is in the set. Adding that requirement would mean that the set is "closed" under the operation f. So if we take Z+ = {1, 2, 3, ...}, ordinary division is defined on Z+, but Z+ is not closed under division, since we can get results like 2/3 that are not in Z+. Whereas division is not defined on Z0+ = {0, 1, 2, ...} because we can get undefined results like 2/0.

However, some definitions of "binary operation" include the "closed" property, so under such definitions, division would not be considered a binary operation on Z0+.
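The difference between "defined" and "closed" can be sketched in a few lines of Python. This is only an illustration on finite samples of Z+ and Z0+ (the helper names are made up here): a failure on a sample is a genuine counterexample, while a success on a sample merely suggests the property holds on the whole infinite set.

```python
from fractions import Fraction

def divide(x, y):
    # Exact rational division; raises ZeroDivisionError when y == 0.
    return Fraction(x, y)

def is_defined(op, elements):
    """True if op(x, y) produces a value for every pair drawn from elements."""
    for x in elements:
        for y in elements:
            try:
                op(x, y)
            except ZeroDivisionError:
                return False
    return True

def is_closed(op, elements, contains):
    """True if every op(x, y) lands back in the set, as judged by contains()."""
    return all(contains(op(x, y)) for x in elements for y in elements)

def in_positive_integers(q):
    return q.denominator == 1 and q > 0

sample_z_plus = range(1, 8)   # finite sample of Z+ = {1, 2, 3, ...}
sample_z0_plus = range(0, 8)  # finite sample of Z0+ = {0, 1, 2, ...}

defined_on_z_plus = is_defined(divide, sample_z_plus)
closed_on_z_plus = is_closed(divide, sample_z_plus, in_positive_integers)
defined_on_z0_plus = is_defined(divide, sample_z0_plus)
```

Division comes out defined but not closed on the Z+ sample (1/2 is already outside it), and not defined on the Z0+ sample because of division by zero.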

>Does it mean that any 2 elements combined together using that operation always need to output an element which is also in the same set?

Specifically in the case of a field, yes; addition needs to be defined, closed and invertible for the set; multiplication needs to be defined, closed and invertible for the set excluding the additive identity (zero).

This is a proper definition, thanks. The first sentence here is about the same length as the one on Wikipedia but it fully encapsulates the meaning without ambiguity.
You're quoting the introduction of the article, which is notoriously a fuzzy abstract in all the Wikipedia articles about mathematical concepts. Let's quote the actual (textual) formal definition (sec. 1.1) as it would stand in a textbook:

> Formally, a field is a set F together with two binary operations on F called addition and multiplication. A binary operation on F is a mapping F × F → F, that is, a correspondence that associates with each ordered pair of elements of F a uniquely determined element of F. The result of the addition of a and b is called the sum of a and b, and is denoted a + b. Similarly, the result of the multiplication of a and b is called the product of a and b, and is denoted ab or a ⋅ b. These operations are required to satisfy the following properties, referred to as field axioms. In these axioms, a, b, and c are arbitrary elements of the field F. [...]

Still, I don't know how to say it another way, but you probably don't have much experience in mathematics; even the first quote is arguably quite accurate.

> Does it mean that any 2 elements combined together using that operation always need to output an element which is also in the same set?

Yes, unless told otherwise, an operation is an internal binary operation; it's really the most common form. When it is not, the output is notable enough to be specified.

> But if that was the case then "behave as the corresponding operations on rational and real numbers do" would mean that the fields would always need to be of infinite size wouldn't it? Because if the field had a limited number of elements and you added the last two (highest) elements together [...]

No, the important point here is your use of "highest". A set by default only has equality (and mappings, in and out) but no order relationship. So the most conservative interpretation of "behave as the corresponding operations on rational" would be to only include stuff that can be written using the 4 operations and equality, not ordering.

---

Maybe I'm biased by the fact that I know what a field is, but still, this particular intro is also how I would present a field: give the most common example and say which operations it has. It sure can create false intuitions like yours about the size, but this will always be the case when we use non-normalized language.

If informal descriptions confuse you, skip them and read actual definitions instead.

I actually like informal definitions a lot. I think they serve two different purposes:

1. For beginners they usually soften the blow of a fully rigorous definition, letting them get an idea of the concept before getting it exactly.

2. For experts they can often suggest what the exact definition is faster than it would be to read a precise definition!

But if you don't get anything out of them, you can skip 'em. Definitions are more important: informal descriptions are there to help you grasp the definition faster and to help you develop an intuition for the concept being defined. If you can do those things faster from just a rigorous definition, you don't need an informal description. (But as I said, I feel informal descriptions benefit both beginners and experts, so I'd also suggest practicing reading them more to get a feeling for what kind of details people tend to omit or emphasize.)

It's not about 'informal definitions'. I also really like informal definitions but not the way that most mathematicians currently tend to write them.
Wait, are you complaining about how mathematicians currently tend to write informal descriptions of concepts or about the informal descriptions in the intros to Wikipedia articles? I think those things are pretty different.
It sounds very much to me like you would like mathematicians to change our notation to accommodate someone who has not put in the effort to learn mathematics. Do you wish the same from structural engineers? Programmers? Physicists? Medical doctors? Musicians?

> "In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined and behave as the corresponding operations on rational and real numbers do."

This isn't a mathematical definition of a field. This is an encyclopedia's intuitive definition. It is a good one, in my opinion, as it'll evoke the correct definition in a person who is trained at mathematics, and hopefully convey the gist of the idea to someone who is not. Very good for one sentence! But a mathematical definition it is not!

A mathematical definition of a field would be as follows:

BEGIN DEFINITION

A field is a set F together with two functions A:F×F→F and M:F×F→F and two elements z∈F and o∈F that together satisfy the following properties:

ASSOCIATIVITY OF A: A(A(a, b), c) = A(a, A(b, c)) for all a,b,c∈F.

ASSOCIATIVITY OF M: Same as above with M in place of A.

COMMUTATIVITY OF A: A(a, b) = A(b, a) for all a,b∈F.

COMMUTATIVITY OF M: Same as above with M in place of A.

NEUTRALITY OF z WRT A: A(a, z) = a for all a∈F.

NEUTRALITY OF o WRT M: M(a, o) = a for all a∈F.

INVERSE FOR A: For all a∈F, there exists an element na∈F such that A(a, na) = z.

INVERSE FOR M: For all a∈F such that a≠z, there exists an element ra∈F such that M(a, ra) = o.

M DISTRIBUTES OVER A: M(a, A(b, c)) = A(M(a, b), M(a, c)) for all a,b,c∈F.

END DEFINITION.

(This definition presupposes that one knows what a set is under a standard framework.)

Since this notation is cumbersome, it is common to write A(a,b) as a+b and M(a,b) as a·b or ab. Likewise, z is often written 0 and o is often written 1 (or e). Similarly, na is often written -a and ra is often written 1/a, but do take care to recall that "-a" and "1/a" are just symbols. One typically compresses notation even further, and writes "a + -b" as "a - b" (not to be confused for the juxtaposition of a and -b).

Do you feel better about this definition than Wikipedia's informal one? I invite you to scribble out a verification that for example the reals form a field under this definition (with ordinary addition as A, ordinary multiplication as M, ordinary 0 as z, ordinary 1 as o).

> Because if the field had a limited number of elements and you added the last two (highest) elements together, the property which requires that the result also be present in the same set could not be met because the result would be greater than the highest element in that set...

You're reading too much into "behave like the corresponding operations on real numbers do". One does not demand that addition preserves order. Notice how there is no reference to ordering or elements being "larger" or "smaller" in the definition above. It is for example the case that in the field of two elements, 1+1=0. That's fine.
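For a finite set, the properties listed in the definition above can be checked by brute force. Below is a minimal Python sketch, assuming arithmetic modulo n as the candidate structure; `is_field` and `integers_mod` are names made up for this illustration, and the check follows the listed properties literally rather than any library's notion of a field.

```python
from itertools import product

def is_field(elements, add, mul, zero, one):
    """Brute-force check of the listed field properties on a finite set."""
    F = list(elements)
    members = set(F)
    # A and M must map F x F back into F.
    if any(add(a, b) not in members or mul(a, b) not in members
           for a, b in product(F, repeat=2)):
        return False
    for a, b, c in product(F, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):          # associativity of A
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):          # associativity of M
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):  # M distributes over A
            return False
    for a, b in product(F, repeat=2):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):  # commutativity
            return False
    for a in F:
        if add(a, zero) != a or mul(a, one) != a:           # neutral elements
            return False
        if not any(add(a, n) == zero for n in F):           # inverse for A
            return False
        if a != zero and not any(mul(a, r) == one for r in F):  # inverse for M
            return False
    return True

def integers_mod(n):
    """Return (elements, add, mul, zero, one) for arithmetic modulo n."""
    return (range(n),
            lambda a, b: (a + b) % n,
            lambda a, b: (a * b) % n,
            0, 1)

two_elements_form_a_field = is_field(*integers_mod(2))
mod_four_is_a_field = is_field(*integers_mod(4))
```

With n = 2 every check passes even though 1 + 1 = 0, so nothing here forces a field to be infinite or ordered. With n = 4 the check fails, because 2 has no multiplicative inverse modulo 4.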

> The problem is that if you start with a highly abstracted math concept and you dig through all the links and definitions of sub-concepts, they all have huge gaps like this

Not at all. You are trying to do rigorous mathematics using informal statements (not to detract from the informal statement; someone who has seen the formal definition of a lot of mathematical structure can almost surely reconstruct the correct formal definition of a field in a second from the informal one).

> I think math definitions should be more elaborate and repetitive if necessary.

So mathematicians should make communication between ourselves more cumbersome in order to please outsiders? I'm sorry, I don't mean to sound like a gatekeeper, but this is ridiculous.

> They should not try to sound terse and clever. They should not assume that the reader can fill in the gaps.

Do you demand this of musicians and engineers and chefs and mechanics and pilots and doctors and nurses too?

> The most rational readers will not be able to fill in the gaps because rational people know the dangers of making assumptions.

The most rational readers who have studied mathematics will. To a point, of course. There is often a tradeoff to be made, but your blanket statement is just plain wrong.

The user joshuaissac gave a very good description of a 'field' as a response to my comment and it was about the same length as the definition on Wikipedia. It shows that it's possible.

I don't see why certain knowledge should be out of reach of those who are not involved directly in that field. I could explain complex software engineering concepts to a layman. They wouldn't be able to use that knowledge to implement the software themselves, but they would be able to use the knowledge to make good high level decisions about it; for example to decide which of two solutions is better given a specific problem.

> The user joshuaissac gave a very good description of a 'field' as a response to my comment and it was about the same length as the definition on Wikipedia. It shows that it's possible.

Sure, his definition is also a good one. It leaves out a lot, though. Which is fine, if one can assume the reader knows the context. My definition, too, leaves out a lot (it assumes set theory), and rests on the informal language known as English.

> I don't see why certain knowledge should be out of reach of those who are not involved directly in that field.

It isn't. The content of mathematics research papers may well be out of reach, but that's quite natural, don't you think? I, as a mathematician, do not expect to be able to read research papers on chemistry without putting in a lot of work.

> I could explain complex software engineering concepts to a layman.

OK. It does not follow from that that everything can be explained to a layman. Some things are easier to explain with layman analogies and mental images than others. However, to keep this fair, I think you should see how many laymen can follow Wikipedia articles on complex software engineering topics with ease! That is, after all, where we started this discussion.

Looks like we read the same books.
> but rather looks like a crazy mess of characters and other marks in a multitude of fonts

You're projecting (in the psychological rather than set-theoretic way).