omg just had a look and this one is just everything I hate about mathematics and academia.
It starts with lots of random definitions, remarks and axioms, and introduces new sign language, while completely neglecting to say what it's supposed to do, explain or help with.
All self-aggrandizement by creating complexity, zero intuition and simplification. Isn't there anybody close to the Feynman of Linear Algebra?
Yeah, a good example is on the second page of the first chapter:
> Remark. It is easy to prove that zero vector 0 is unique, and that given
v ∈ V its additive inverse −v is also unique.
This is the first time the word "unique" is used in the text. Students are going to have no idea whether this is meant in some technical sense or just conventional English. One can imagine various meanings, but that doesn't substitute for real understanding.
This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous. On the surface the opposite is true - you complain, for instance, that the text jumps immediately into using technical language without any prior introduction or intuition building. My take is that intuition building doesn't need to replace or preface the use of formal precision, but that what is needed is to bridge concepts the student already understands and has intuition for to the new concept that the student is to learn.
In terms of intuition building, I think it's probably best to introduce vectors via talking about Euclidean space - which gives the student the possibility of using their physical intuitions. The student should build intuition for how and why vector space "axioms" hold by learning that fundamental operations like addition (which they already grasp) are being extended to vectors in Euclidean space. They already instinctively understand the axiomatic properties being introduced, it's just that the raw technical language being thrown at them fails to connect to any concept they already possess.
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous.
The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor. Especially in comparison to the average programmer.
They are conceptually/abstractly rigorous, but in "implementation" are incredibly sloppy. But they've been in that world so long they can't really see it / just expect it.
And if you debate with one long enough, they'll eventually concede and say something along the lines of "well math evolved being written on paper and conciseness was important so that took priority over those other concerns." And it leaks through into math instruction and general math text writing.
Programming is forced to be extremely rigorous at the implementation level simply because what is written must be executed. Now engineering abstraction is extremely conceptually sloppy and if it works it's often deemed "good enough". And math generally is the exact opposite. Even for a simple case, take the number of symbols that have context-sensitive meanings. Mathematicians will use them without declaring which context they are using, and a reader is simply supposed to infer correctly. It's actually somewhat funny because it's not at all how they see themselves.
> The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor. Especially in comparison to the average programmer.
Not sure why you say that. Mathematicians are pretty open about it. The well-known essay "On Proof and Progress in Mathematics" discusses it. It was written by a Fields medalist.
This drove me mad when I had to do introductory maths at uni. Maths as written did seem pretty sloppy and not at all like a programming language whose expressions I could parse as I expected. Obv most simple algebra looks about as you'd expect, but I clearly recall feeling exactly what you describe in some cases, and I commented on it to the lecturer during a tutorial, asking why it was that way. He humoured me, was a good guy.
But I think mathematicians probably have a point - it did evolve that way over a long time and anyone practicing it daily just knows how to do it and they're not going to do a thorough review and change now.
It's us tourists that get thrown for a loop, but so it goes. It's not meant for us.
> Maths as written did seem pretty sloppy and not at all like a programming language whose expressions I could parse as I expected.
Look at Lean's mathlib: that's what fully formal mathematics looks like. It's far too verbose to be suitable for communicating with other people; you might as well try to teach an algorithms course transistor by transistor.
You’re confusing syntax and semantics. Programmers write code for syntax machines (Turing machines). The computers care a lot about syntax and will halt if you make an error. They do not care at all about semantics. A computer is happy to let you multiply a temperature in Fahrenheit times a figure in Australian dollars and subtract the volume of the earth in litres, provided that these numbers are all formatted in a compatible enough way that they can be implicitly converted (this depends on the programming language but many of them are quite liberal at this).
If you want the computer to stop you from doing such nonsense, you’ve got to put in a lot of effort to make types or contracts or write a lot of tests to avoid it. But that’s essentially a scheme for encoding a little bit of your semantics into syntax the computer can understand. Most programmers are not this rigorous!
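To make the point about encoding semantics into syntax concrete, here is a minimal Python sketch. The `Quantity` class and its unit labels are hypothetical helpers invented for illustration, not a real units library, and the Earth-volume figure is only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit label (hypothetical helper, not a real library)."""
    value: float
    unit: str

    def __sub__(self, other: "Quantity") -> "Quantity":
        # Subtraction only makes sense between quantities of the same unit.
        if self.unit != other.unit:
            raise TypeError(f"cannot subtract {other.unit} from {self.unit}")
        return Quantity(self.value - other.value, self.unit)

    def __mul__(self, other: "Quantity") -> "Quantity":
        # Multiplication is allowed, but the result carries a compound unit,
        # so the mismatch stays visible instead of collapsing into a bare float.
        return Quantity(self.value * other.value, f"{self.unit}*{other.unit}")


# Plain floats: the interpreter accepts this without complaint.
temperature_f = 72.0
price_aud = 19.99
earth_volume_l = 1.083e24  # roughly the volume of the Earth in litres
nonsense = temperature_f * price_aud - earth_volume_l


def tagged_nonsense() -> Quantity:
    """The same expression with tagged quantities; the subtraction raises TypeError."""
    product = Quantity(72.0, "degF") * Quantity(19.99, "AUD")
    return product - Quantity(1.083e24, "L")
```

With bare floats the expression evaluates silently; with the tagged version the check only exists because someone wrote it, which is the extra effort described above.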
Mathematicians, on the other hand, write mathematics for other humans to read. They expect their readers to have done their homework long before picking up the paper. They do not have any time to waste in spelling out all the minutiae, much of which is obvious and trivial to their peers. The sort of formal, syntax-level rigour you prefer, which can be checked by computers, is of zero interest to most mathematicians. What matters to them, at the end of the day, is making a solid enough argument to convince the establishment within their subfield of mathematics.
But programmers are expected to get the semantics right. Sure, mismatching temperatures and dollars happens, but it’s called a bug and you will be expected to fix it.
Why do mathematicians hide their natural way of thinking? They provide their finished work and everyone is supposed to clap. Why can't they write long articles about false starts, dead ends and so on? It's only with magazines like Quanta and YouTube channels that we get to feel the thinking process. Math is not hard. At least the mathematics we are expected to know.
Mathematics is extremely hard. The math people are expected to know for high school is not hard, but that is such a minuscule amount of math compared to what we (humans) know, collectively.
Mathematicians do speak and also write books about the thinking process. It’s just very difficult and individualized. It’s a nonlinear process with false starts and dead ends, as you say.
But you can’t really be told what it feels like. You have to experience it for yourself.
>Even for a simple case, take the number of symbols that have context-sensitive meanings. Mathematicians will use them without declaring which context they are using, and a reader is simply supposed to infer correctly.
Yes!! Like, oh, you didn't know that p-looking thing (rho) means Pearson's correlation coefficient? That theta means an angle? Well just sit there in ignorance because I'm not going to explain it. And those are the easy ones!
My experience with the average programmer is...different from yours. The software development field is exceptionally bad in this regard. Physicists are mathematically sloppy sometimes (why, yes, I will just multiply both sides by `dy` and take as many liberties with operators, harmonics/series, and vector differential operations as I care to, thanks).
Mathematics, like any other academic field, has jargon (and this includes notation, customary symbols for a given application, etc.), and students of the field ought to learn the jargon if they wish to be proficient. On the other hand, textbooks meant to instruct ought to teach the jargon. It's been forever since I've opened a mathematics textbook; I don't recall any being terribly bad in this regard.
Well I have a different approach. Sometimes I write and hack it to solve a particular problem. The code might be elegant or not, but if you understand the problem you can probably grok the code.
Next I generalize it a bit. Specific variables become configurable parameters. Something that happened implicitly or with a single line of code gets handled by its own function. Now it’s general but makes much less sense at first because it’s no longer tied to one problem, but a whole set of problems. It’s a lot less teachable and certainly not self-evident any more.
The problem with math education is that we think the second approach is inherently superior to the first, and would make a better textbook, because it’s more generic. But that is not how real people learn: they would all “do” math the first way. By taking away the student’s ability to do the generalization themselves, we are depriving them of the real pleasure of programming (or math).
Maybe back when paper was scarce this approach made sense but not any more.
Ideally I would love to present a million specific solutions and let them generalize themselves. That is exactly how we would train an ANN. Not by regurgitating the canned solution but by giving it all sorts of data and letting it generalize for itself. So why don’t we do this for human students? When it comes to education I think people have a blind spot towards how learning is actually done.
Notation and terminology? Sure, some explanations and mechanical manipulations are elided in mathematics because with context they're clear.
Rigor? Ha, you have got to be kidding me. Math is rigorous to a fault. Typical computer programming has no rigor whatsoever in comparison. A rigid syntax is not rigor. Syntax, in turn, is certainly not the difficult part of developing software.
This, really. Sometimes, when reading math papers, you find that they end up being very hand-wavy with the notation, e.g. with subscripting, because "it's understood". But without extensive training in mathematics, a lot of it is not understood.
Programmers will use languages with a lot of syntactic sugar, and without knowing the language, code can be pretty difficult to understand when it is used. But even then, you can't be sloppy, because computers are so damn literal.
> The thing that mathematicians refuse to admit is that they are extremely sloppy with their notation, terminology and rigor.
The "refuse" part is imo very dependent on the person. Nearly all of my professors for theoretical CS courses just plainly said that their "unique" notation is just their way because they like it.
It's more or less just a simple approach to alter the language to fit your task. This is also not unfamiliar to the programmers who may choose a language based on the task, with, e.g. Fortran for vector based calculus or C for direct hardware access.
Bro I know this feel. Even books teaching algorithms, when written by mathematicians, have errors everywhere.
They never state the type or class, there are no comments, no explanations, reads go past the last index... This list can go on endlessly. When they say "let's declare an empty set for variable T", you don't know whether the thing is a list, set, tuple, ndarray, placeholder for a scalar, or a graph.
Some even provide actual code, yet never actually run the code to verify its correctness.
Try this guy then, he's got a PhD in mathematics from the California Institute of Technology from a thesis Finite Semifields and Projective Planes but he's written a bunch of stuff on algorithms and will write you a check for any errors you find in his work: https://en.wikipedia.org/wiki/Donald_Knuth
I believe that any mathematician who took a differential geometry class must have already realized this: the notation is so compressed and implicit that some proofs practically are "by notation", as it can be a daunting prospect to expand a dozen indices.
The average computer scientist (not only "programmer", as a JS dev would be) has never written Lean/Coq or similar, and is not aware of Curry-Howard-style results and their implications.
I think you entirely missed the point. GP put it well:
>> They are conceptually/abstractly rigorous, but in "implementation" are incredibly sloppy.
Maturity in concept-space and the ability to reason abstractly can be achieved without the sort of formal rigor required by far less abstract and much more conceptually simple programming.
I have seen this first hand TAing and tutoring CS1. I regularly had students who put off their required programming course until senior year. As a result, some were well into graduate-level mathematics and at the top of their class but struggled deeply with the rigor required in implementation. Think about, e.g., missing semi-colons at the end of lines, understanding where a variable is defined, understanding how nested loops work, simple recursion, and so on. Consider something as simple as writing a C/Java program that reads lines from a file, parses them according to a simple format, prints out some accumulated value from the process, and handles common errors appropriately. Programming requires a lot more formal rigor than mathematical proof writing.
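For a sense of how many small decisions such an exercise involves, here is a rough Python sketch (the comment above mentions C/Java; Python is used here only for brevity). The `name,amount` line format, the function names and the return codes are all assumptions made up for this example.

```python
import sys


def sum_amounts(lines):
    """Sum the amounts in lines of the assumed form 'name,amount'.

    Returns (total, errors), where errors lists (line number, reason)
    for every line that could not be parsed. Line numbers are 1-based.
    """
    total = 0.0
    errors = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(",")
        if len(parts) != 2:
            errors.append((lineno, "expected exactly one comma"))
            continue
        try:
            total += float(parts[1])
        except ValueError:
            errors.append((lineno, f"not a number: {parts[1]!r}"))
    return total, errors


def main(path):
    """Print the total for the file at path; return a process exit code."""
    try:
        with open(path, encoding="utf-8") as handle:
            total, errors = sum_amounts(handle)
    except OSError as exc:
        print(f"cannot read {path}: {exc}", file=sys.stderr)
        return 1
    for lineno, reason in errors:
        print(f"line {lineno}: {reason}", file=sys.stderr)
    print(total)
    return 0 if not errors else 2
```

Even in this tiny version there are choices about blank lines, malformed rows, unreadable files and exit codes that a proof on paper would never force anyone to spell out.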
but programmers don't write underspecified notational shortcuts, because those are soundly rejected as syntax errors by the compiler or interpreter
this is not about semantics (like dependent types etc) this is just syntax. it works like this in any language. the only way to make syntax accepted by a compiler is to make it unambiguous
... maybe LLMs will change this game and the programming languages of the future will be allowed to be sloppy, just like mathematics
> Remark. It is easy to prove that zero vector 0 is unique, and that given v ∈ V its additive inverse −v is also unique.
I'm sorry, this book is meant for an audience who can read and write proofs. Uniqueness proofs are a staple of mathematics. If the word "unique" throws you off, then this book is not meant for you.
I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra regardless of what book or course you're trying to learn from. The purely computational approach to mathematics used up through high school (with the oddball exception of Euclidean geometry) and many introductory calculus classes can't really go much further than that.
Or, you know, mathematics can be viewed as a powerful set of tools…
Somehow I seem to remember getting through an engineering degree, taking all the optional extra math courses (including linear algebra), without there ever being a big emphasis on proofs. I’m sure it’s important if you want to be a mathematician, but if you just want to understand enough to be able to use it?
> taking all the optional extra math courses (including linear algebra), without there ever being a big emphasis on proofs
Sorry to break it to you, but you didn't take math classes. You took classes of the discipline taught in high school under the homonymous name "math". There is a big difference.
It's the same difference as there is between what you get taught in grade school under the name "English" (or whatever is the dominant language where you live): the alphabet, spelling, pronunciation, basic sentence structure... And what gets taught in high school under the name "English": how to write essays, critically analyze pieces of literature, etc. The two sets of skills are almost completely unrelated. The first is a prerequisite for the second (how can you write an essay if you can't write at all?), so somehow the two got the same name. But nobody believes that winning a spelling bee is the same type of skill as writing a novel.
I know it's a shock to everyone who enters a university math course after high school. Many of my 1st year students are confounded about the fact that they'll be graded on their ability to prove things. They expect the equivalent of cooking recipes to invert matrices, compute a GCD, solve a quadratic equation, or whatever, and balk at anything else. I want them to understand logical reasoning, abstract concepts, and the difference between "I'm pretty sure" and "this is an absolute truth". There's a world of difference, and most have to wait a few years to develop enough maturity to finally get it.
I don't know whom to agree with. Maybe there need to be two tracks, and it might not even depend on discipline, but just personal preference. Do you love math as an art form, or as a problem solving tool? Or both?
I went back and forth. I was good at problem solving, but proofs were what made math come alive for me, and I started college as a math major. Then I added a physics major, with its emphasis on problem solving. But I would have struggled with memorizing formulas if I didn't know how they were related to one another.
Today, K-12 math is taught almost exclusively as problem-solving. This might or might not be a realistic view of math. On the one hand, very few students are going to become mathematicians, though they should at least be given a chance. On the other hand, most of them are not going to use their school math beyond college, yet math is an obstacle for admission into some potentially lucrative careers.
At my workplace, there's some math work to be done, but only enough to entertain a tiny handful of "math people," seemingly unrelated to their actual specialty.
Are there courses/books on "applied linear algebra"? You are right in some sense, but wrong in some sense. Linear algebra at a 100 level without any really deep understanding is still incredibly useful. Graphics (i guess you sort of call this out), machine learning etc.
Most university math curriculums have a clear demarcation between the early computation-oriented classes (calculus, some diff eq.) and later proof-oriented classes. Traditionally, either linear algebra or abstract algebra is used as the first proof-oriented course, but making that transition to proof-based math at the same time as digesting a lot of new subject matter can be brutal, so many schools now have a dedicated transition course (often covering a fair bit of discrete mathematics). But there's still demand for textbooks for a linear algebra course that can serve double-duty of teaching engineering students a bag of tricks and give math students a reasonably thorough treatment of the subject.
>I'd go a bit further and say that if you're not comfortable with the basics of mathematical proofs, then you're not ready for the subject of linear algebra
I can't write a proof to save my life, but I'm going to keep using linear algebra to solve problems and make money, nearly every day. Sorry!
We had this discussion about Data Science years ago: "you aren't a real Data Scientist unless you fully understand subjects X, Y, Z!"
Now companies are filled to the brim with Data Scientists who can't solve a business problem to save their life, and the companies are regretting the hires. Nobody cares what proofs they can write.
There are (at least) two different things we're calling "linear algebra" here: roughly speaking, one is building the tools and one is using the tools.
The mathematicians need to understand the basics of mathematical proofs to learn how to prove new interesting (and sometimes useful) stuff in linear algebra. You have to do the math stuff in order to come up with some new matrix decomposition or whatever.
The engineers/data scientists/whatever people just need to understand how to use them.
You don't need to know how to build a car to drive one. The mathematicians are building the cars, you're using them.
I don't think I've ever done more rote manual calculation than for my undergrad linear algebra class! On tests and homework just robotically inverting matrices, adding/subtracting them (I think I even had to do some of that in high school algebra), multiplying them (yuck). It was tedious and frustrating and anything but theoretical.
I've learned linear algebra course quality varies substantially. One acquaintance whom I met after they graduated from a big university in Canada reported having to do things like by-hand step-by-step reduced row echelon form computations for 3x4 matrices or larger. I had to do such things in "Algebra 2" in junior high (9th grade), until our teacher kindly showed us how to do the operations on the calculator and stopped demanding work steps. If we had more advanced calculators (he demoed on some school-owned TI-92s, convincing me to ask for a TI-89 Titanium for Christmas) we could use the rref() function to do it all at once.
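For anyone curious what a calculator-style rref() has to do internally, here is a minimal Python sketch using exact fractions. It is an illustration of Gauss-Jordan elimination under my own naming, not the TI implementation.

```python
from fractions import Fraction


def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        source = next(
            (r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None
        )
        if source is None:
            continue  # no pivot in this column
        rows[pivot_row], rows[source] = rows[source], rows[pivot_row]
        # Scale the pivot row so the pivot entry becomes 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Clear this column in every other row.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows


# A 3x4 augmented matrix for the system
#   x + y + z = 6,  2y + 5z = -4,  2x + 5y - z = 27
reduced = rref([[1, 1, 1, 6], [0, 2, 5, -4], [2, 5, -1, 27]])
```

Here `reduced` is the identity matrix with a final column of 5, 3, -2, which is the solution of the system. Doing the same steps by hand is exactly the tedium described above.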
In my actual linear algebra class in freshman year college we were introduced to a lot of proper things I wish I had seen before, along with some proofs but it wasn't proof heavy. I did send a random email to my old 9th grade teacher about at least introducing the concept of the co-domain, not just domain and range, but it was received poorly. Oh well. (There was a more advanced linear algebra class but it was not required for my side. The only required math course that I'd say was proof heavy was Discrete Math. An optional course, Combinatorial Game Theory, was pretty proof heavy.)
Linear algebra is usually a required (or at least strongly encouraged) course for an undergraduate degree in basically any engineering discipline, and it is usually not preceded by a course in "the basics of mathematical proofs".
> this book is meant for the audience who can read and write proofs
It seems like the opposite is true:
"It is intended for a student who, while not yet very familiar with abstract reasoning, is willing to study more [than a] "cookbook style" calculus type course."
(from the link).
If your point is one can't learn linear algebra before learning "abstract [mathematical] reasoning"...don't think you're the main target audience of a subject as practical as linear algebra.
> Besides being a first course in linear algebra it is also supposed to be a first course introducing a student to rigorous proof, formal definitions---in short, to the style of modern theoretical (abstract) mathematics.
So I think it's fair to say that the book (ought to) assume zero knowledge of proofs, contra your parent's claim that the audience is expected to be able to read and write proofs.
From the second paragraph of the introduction to the book we are discussing:
> Besides being a first course in linear algebra it is also supposed to be a first course introducing a student to rigorous proof, formal definitions---in short, to the style of modern theoretical (abstract) mathematics.
So it's certainly meant to be the first math book one sees in their life that discusses rigorous proofs.
A vector space is defined as having a zero vector, that is, a vector v such that for any other vector w, v + w = w.
Saying the zero vector is unique means that only one vector has that property, which we can prove as follows. Assume that v and v’ are zero vectors. Then v + v’ = v’ (because v is a zero vector). But also, v + v’ = v’ + v = v, where the first equality holds because addition in a vector space is commutative, and the second because v’ is a zero vector. Since v + v’ = v’ and v + v’ = v, v’ = v.
We have shown that any two zero vectors in a vector space are in fact the same, and therefore that there is exactly one zero vector per vector space.
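Since Lean came up earlier in the thread, here is roughly what the argument above looks like when written formally. This is a sketch in plain Lean 4 without Mathlib; the names `add`, `add_comm`, `hv` and `hv'` are chosen for illustration, and only commutativity and the zero property are assumed.

```lean
-- v and v' are both "zero vectors" for an abstract commutative addition.
theorem zero_unique {V : Type} (add : V → V → V)
    (add_comm : ∀ a b : V, add a b = add b a)
    (v v' : V)
    (hv : ∀ w : V, add v w = w)     -- v is a zero vector
    (hv' : ∀ w : V, add v' w = w) : -- v' is a zero vector
    v' = v :=
  calc v' = add v v' := (hv v').symm
    _ = add v' v := add_comm v v'
    _ = v := hv' v
```

Each `calc` step corresponds to one equality in the prose proof: v is a zero, addition commutes, v' is a zero.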
We used this in my Discrete Mathematics class (MATH 2001 @ CU Boulder) (it is a pre-requisite for most math classes). The section about truth tables did overlap a bit with my philosophy class (PHIL 1440 Critical Thinking)
> The above statement of zero vector is unique, I have no idea what is that means.
In isolation, nothing. (Neither does the word “vector”, really.) In the context of that book, the idea is more or less as follows:
Suppose you are playing a game. That game involves things called “vectors”, which are completely opaque to you. (I’m being serious here. If you’ve encountered some other thing called “vectors”, forget about it, at least until you get to the examples section, where various ways to implement the game are discussed.)
There’s a way to make a new vector given two existing ones (denoted + and called “addition”, but not the same as real-number addition) and a way to make a new vector given an existing one and a real number (denoted by juxtaposition and called “multiplication”, but once again that’s a pun whose usefulness will only become apparent later) (we won’t actually need that one here). The inner workings of these operations in turn are also completely opaque to you. However, the rules of the game tell you that
1. It doesn’t matter in which order you feed your two vectors into the “addition” operation (“add” them): whatever existing vectors v and w you’re holding, the new vector v+w will turn out to be the same as the other new vector w+v.
2. When you “add” two vectors and then “add” the third to the result, you’ll get the exact same thing as when you “add” the first to the “sum” of the second and third; that is, whatever the vectors u, v, and w are, (u+v)+w is equal to u+(v+w).
(Why three vectors and not four or five? It turns out that if you have the rule for three, you can prove those for four, five, and so on, even though there are going to be many more ways to place the parens there. See Spivak’s “Calculus” for a nice explanation, or if you like compilers, look up “reassociation”.)
3. There is [at least one] vector, call it 0, such that adding it to anything else doesn’t make a difference: for this distinguished 0 and whatever v, v+0 is the same as v.
Let’s now pause for a moment and split the last item into two parts.
We’ll say a vector u deserves to be called a “zero” if, whatever other vector we take [including u itself!], we will get it back again if we add u to it; that is, for any v we’ll get v+u=v.
This is not an additional rule. It doesn’t actually tell us anything. It’s just a label we chose to use. We don’t even know if there are any of those “zeros” around! And now we can restate rule 3, which is a rule:
3. There is [at least one] “zero”.
What the remark says is that, given these definitions and the three rules, you can show, without assuming anything else, that there is exactly one “zero”.
(OK, what the remark actually says is that you can prove that from the full set of eight rules that the author gives.
But that is, frankly, sloppy, because the way rule 4 is phrased actually assumes that the zero is unique: either you need to say that there’s a distinguished zero such that for every v there’s a w with v+w= that zero, or you need to say that for every v there’s a w such that v+w is a zero, possibly a different one for each v. Of course, it doesn’t actually matter!—there can only be one zero even before we get to rule 4. But not making note of that is, again, sloppy.
This kind of sloppiness is perfectly acceptable among people who have seen this sort of thing before, say done finite groups or something like that. But if the book is supposed to give a first impression, this seems like a bad idea. Perhaps a precalculus course of some sort is assumed.
Read Spivak, seriously. He’s great. Not linear algebra, though.)
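If you'd rather see the "exactly one zero" claim checked than proved, here is a small brute-force sketch in Python. It treats the game literally: three opaque tokens and every possible "addition" table on them. The names are mine, and the check only covers a three-element set, so it illustrates the claim rather than proving it.

```python
from itertools import product

# Three opaque "vectors"; the label 0 is not special here.
ELEMENTS = (0, 1, 2)


def zeros(table):
    """Return every u with v+u == v for all v, where table[(v, u)] means v+u."""
    return [u for u in ELEMENTS if all(table[(v, u)] == v for v in ELEMENTS)]


def is_commutative(table):
    """Rule 1: v+w is the same as w+v for every pair."""
    return all(table[(a, b)] == table[(b, a)] for a in ELEMENTS for b in ELEMENTS)


def max_zero_count():
    """Largest number of zeros over all commutative 'additions' on ELEMENTS."""
    pairs = list(product(ELEMENTS, repeat=2))
    best = 0
    # 3**9 = 19683 candidate tables, small enough to enumerate.
    for outputs in product(ELEMENTS, repeat=len(pairs)):
        table = dict(zip(pairs, outputs))
        if is_commutative(table):
            best = max(best, len(zeros(table)))
    return best


# Without rule 1 the claim fails: "v+u is always v" makes every element a zero.
left_projection = {(a, b): a for a in ELEMENTS for b in ELEMENTS}
```

`max_zero_count()` comes out as 1, while `zeros(left_projection)` lists all three elements, so it really is the rules doing the work and not the name "zero".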
I think a lot of people just need an opportunity to see math demonstrated in a more tangible way.
For example, I learned trig, calculus, and statistics from my science classes, not from my math classes (and that's despite getting perfect A's in all of my math classes). In math class, I was just mindlessly going through the motions and hating every second of it, but science classes actually taught me why it worked and showed me the beauty and cleverness of it all.
I think most college math depts have "applied math" majors. I like both sides of math, but I found it incredibly frustrating when I would try to study just the equations for that chapter, only to be tested on a word problem. The whole "trying to trick you" conspiracy turned me off to college in general. If I'm trying to teach someone how to do something, I would show them "A, then B, and you get C", then assign a variety of homework of that form, and on the test, say "A, then B, then _____" and they would be correct if they concluded C. But for some reason this method isn't used much in university. If I wanted to teach a student how to start with C and deconstruct it into A and B, that's what I would have taught them!
If you study mathematics at a rigorous level then you learn by writing proofs. Then you will rack your brain for hours or even days trying to figure out how to prove some simple things. It is not at all “going through the motions” at that point!
A first course in linear algebra still assumes background information, because linear algebra is not a basic topic. It’s not meant to be a first course in math. Math builds on itself and it would be incredibly inconvenient if every course everywhere would have to include a recap of basic things. And proofs are among the most fundamental things in math!
Programming courses or articles or books, beyond the 101 level, don’t teach you again and again the basics of declaring a variable and writing a loop either! No field does that.
Wrt linear algebra in particular, there are plenty of resources aimed at programmers thanks to its relevance in computer graphics and so on. They typically skip proofs and just tell you that this is how matrix multiplication is defined, but they don’t teach you math, merely using math. Which can be plenty enough to an engineer.
This is not snobbery, some subjects just have prerequisites.
You can't learn computer science without having a good sense of what an "algorithm" is, you have to know how to read and write and understand algorithms. Similarly you can't learn math without having a good sense of what a proof is, reading, writing and understanding proofs is the heart of what math is.
Even more strongly, trying to learn math without a solid understanding of how proofs work is something like studying English literature while refusing to learn how to read English.
> Even more strongly, trying to learn math without a solid understanding of how proofs work is something like studying English literature while refusing to learn how to read English.
It depends why you're trying to learn math. Are you interested in math for math's sake, or are you trying to actually do something with it?
If it's the former, then yeah, you need proofs. Otherwise, like in your analogy, it's like studying English literature without knowing any English grammar rules.
But if you're trying to apply the math, if you're studying linear algebra because it's useful rather than for its own sake, then you don't need proofs. To follow the same analogy, it's like learning enough English to be conversational and get around America, without knowing what an "appositive" is.
The software industry, similarly, is full of people who make use of computer science concepts without having rigorously studied computer science. You can't learn true "computer science" without an understanding of discrete math, but you can certainly get a job as an entry-level SWE without one. You don't need discrete math to learn Python, see that it's useful, and do something interesting with it.
The same applies to linear algebra. Everyone who does vector math doesn't need to be able to prove that the tools they are using work. If everyone who does vector math is re-deriving their math from first principles, then something's gone terribly wrong. There's a menu of known, well-defined treatments that can be applied to vectors, and one can read about them and trust that they work without having proven why they work.
EDIT: it occurs to me, an even stronger analogy of this point, is that it is entirely possible to study computer science, without having any understanding of electrical engineering or knowing how a transistor works.
Mathematicians are well aware of complaints like these about introductions to their subjects, by the way.
It is for a reason that this book introduces the theory of abstract vector spaces and linear transformations, rather than relying on the crutch of intuition from Euclidean space. If you want to become a serious mathematician (and this is a book for such people, not for people looking for a gentle introduction to linear algebra for the purposes of applications) at some point it is necessary to rip the bandaid of unabstracted thinking off and engage seriously with abstraction as a tool.
It is an important and powerful skill to be presented with an abstract definition, only loosely related to concrete structures you have seen before, and work with it. In mathematics this begins with linear algebra, and then with abstract algebra, real analysis and topology, and eventually more advanced subjects like differential geometry.
It's difficult to explain to someone whose exposure to serious mathematics is mostly on the periphery that being exposed forcefully to this kind of thinking is a critical step to be able to make great leaps forward in the future. Brilliant developments of mathematics like, for example, the realisation that "space" is an intrinsic concept and geometry may be done without reference to an ambient Euclidean space begin with learning this kind of abstract thinking. It is easy to take for granted the fruits of this abstraction now, after the hard work has already been put in by others to develop it, and think that the best way to learn it is to return back to the concrete and avoid the abstract.
The point of starting with physical intuition isn't to give students a crutch to rely on, it's to give them a sense of how to develop mathematical concepts themselves. They need to understand why we introduce the language of vector spaces at all - why these axioms, rather than some other set of equally arbitrary ones.
This is often called "motivation", but motivation shouldn't be given to provide students with a reason to care about the material - rather the point is to give them an understanding of why the material is developed in the way that it is.
To give a basic example, high school students struggle with concepts like the dot and cross products, because while it's easy to define them, and manipulate symbols using them, it's hard to truly understand why we use these concepts and not some other, e.g. the vector of componentwise products (a_1 * b_1, a_2 * b_2, ...).
While it is a useful skill to be adroit at symbol manipulation, students also need an intuition for deciding which way to talk about an unfamiliar or new concept, and this is an area in which I've found much of mathematics (and physics) education lacking.
Physical intuition isn’t going to help when you’re dealing with infinite-dimensional vector spaces, abstract groups and rings, topological spaces, mathematical logic, or countless other topics you learn in mathematics.
>... rather than relying on the crutch of intuition from Euclidean space
Euclidean space is not a good crutch, but there are other, much more meaningful, crutches available, like (orthogonal) polynomials, Fourier series etc. Not mentioning any motivations/applications is a pedagogical mistake IMO.
I think we need some platform for creating annotated versions of math books (as a community project) - that could really help.
On that of course I agree, but mathematicians tend to "relegate" such things to exercises. This tends to look pretty bad to enthusiasts reading books because the key examples aren't explored in detail in the main text, but actually those exercises become the foundation of learning for people taking a structured course, so it's a bit of a disconnect when reading a book pdf. When you study such subjects in structured courses, 80%+ of your engagement with the subject will be in the form of exercises exploring exactly the sorts of things you mentioned.
Axler serves as an adequate first introduction to linear algebra (though it is intended to be a second, more formal, pass through. Think analysis vs calculus), but it isn't intended to be a first introduction to all of formal mathematics! A necessary prereq is understanding some formal language used in mathematics- what unique means is included in that.
Falling entirely back on physical intuition is fine for students who will use linear algebra only in physical contexts, but linear algebra is often a stepping stone towards more general abstract algebra. That's what Axler aims to help with, and with arbitrary (for instance) rings there isn't a nice spatial metaphor to help you. There you need to have developed the skill of looking at a definition and parsing out what an object is from that.
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous.
This is _precisely_ the opinion of Roger Godement, French mathematician and member of the Bourbaki group.
I would highly recommend his books on Algebra. They are absolutely uncompromising on precision and correctness, while also being intuitive and laying down all the logical foundations of their rigor.
Overall, I cannot recommend enough the books of the Bourbaki group (esp. Dieudonne & Godement). They are a work of art in the same sense that TAOCP is for computer science.
Unfortunately, some of the Bourbaki books need to be read in French, because the typesetting on the English translations is so atrocious as to be unreadable; as a consolation, the typesetting on the original French is, as always, immaculate.
I absolutely agree about additional rigor and precision making math easier to learn. Only after you're familiar with the concepts can you be more lazy.
That's the approach taken by my favorite math book:
> This is actually why I feel that mathematical texts tend to be not rigorous enough, rather than too rigorous. On the surface the opposite is true - you complain, for instance, that the text jumps immediately into using technical language without any prior introduction or intuition building. My take is that intuition building doesn't need to replace or preface the use of formal precision, but that what is needed is to bridge concepts the student already understands and has intuition for to the new concept that the student is to learn.
If you read the book in the original post you may find it's absolutely for you.
Axler assumes you know only the real numbers, then starts by introducing the commutative and associative properties and the additive and multiplicative identity of the complex numbers[1]. Then he introduces fields and shows that, hey look we have already proved that the real and complex numbers are fields because we've established exactly the properties required. Then he goes on to multidimensional fields and proves the same properties (commutativity and associativity and identities) in F^n where F is any arbitrary field, so could be either the real or the complex numbers.
Then he moves onto vectors and then onto linear maps. It's literally chapter 3 before you see [ ] notation or anything that looks like a matrix, and he introduces the concept of matrices formally in terms of the concepts he has built up piece by piece before.
Axler really does a great job (imo) of this kind of bridge building, and it is absolutely rigorous each step of the way. As an example, he (famously) doesn't introduce determinants until the last chapter because he feels they are counterintuitive for most people and you need most of the foundation of linear algebra to understand them properly. So he builds up all of linear algebra fully rigorously without determinants first and then introduces them at the end.
[1] eg he proves that there is only one zero and one "one" such that A = 1*A and A = 0 + A.
A lot of people think Gil Strang was that. Certainly his 18.06SC lecture series is fabulous.[1]
I really like Sheldon Axler and he has made a series of short videos to accompany the book that I think are wonderful. Very clear and easy to understand, but with a little bit more of the intuition behind the proofs etc.
This, betterexplained, ritvikmath, SeeingTheory will give you a very solid math background (I think they are better than 90% of the intro math classes in colleges).
> Isn‘t there anybody close to the Feynman of Linear Algebra?
No. The subject is too young (the first book dedicated to Linear Algebra was written in 1942).
Since then, there have been at least 3 generations of textbooks (the first one was all about matrices and determinants). That was boring. Each subsequent iteration is worse.
What is dual space? What motivates the definition? How useful is the concept? After watching no less than 10 lectures on the subject on youtube, I'm more confused than ever.
Why should I care about different forms of matrix decomposition? What do they buy me? (It turns out, some of them are useful in computer algebra, but the math textbook is mum about it)
My overall impression is: the subject is not well understood. Give it another 100 years. :-)
Gilbert Strang (already mentioned by fellow commenters).
> The subject is too young
"The first modern and more precise definition of a vector space was introduced by Peano in 1888; by 1900, a theory of linear transformations of finite-dimensional vector spaces had emerged." (from Wikipedia)
The first book was written in 1942 - it's mentioned explicitly in LADR.
It doesn't mean the concepts didn't exist - they did, Frobenius even built a brilliant theory around them (representation theory), but the subject was defined quite loosely - apparently no one cared to collect the results in one place.
It doesn't even matter much: I remember taking the course in 1974, and it was totally different from what is being taught today.
What? Linear Algebra is easily one of the best understood fields of mathematics. Maybe elementary number theory has it beat, but the concepts that drive useful higher level number theory aren't nearly so clear or direct as those driving linear algebra. It's used as a lingua franca between all sorts of different subjects because mathematicians of all stripes share an understanding of what it's about.
From what you said there, it seems like you tried to approach linear algebra from nearly random directions- and often from the end rather than the beginning. If you're in it for the computation, Axler definitely isn't for you. There are texts specifically on numeric programming- they'll jump straight to the real world use. If you want to understand it from a pure math perspective, I'd recommend taking a step back and tackle a textbook of your choosing in order. The definition of a dual space makes a lot more sense once you have a vector space down.
I sympathize with the person you're responding to a lot more than you.
It's very easy to understand what a dual space is. It's very hard to understand why you should care. Many of the constructions that use it seem arbitrary: if finite vector spaces are isomorphic to their duals, why bother caring about the distinction? There are answers to this question, but you get them somewhere between 1 and 5 years later. It is a pedagogical nightmare.
Every concept should have both a definition and a clear reason to believe you should bother caring about it, such as a problem with the theory that is solved by the introduction of that concept. Without the motivating examples, definitions are pointless (except, apparently, to a certain breed of mathematicians).
I've read something like 100 math textbooks at this point. I would rate their pedagogical quality between an F and a D+ at best. I have never read a good math textbook. I don't know what it is, but mathematicians are determined to make the subject awful for everybody who doesn't think the way they do.
(I hope someday to prove that it's possible to write a good math textbook by doing it, but I'm a long way away from that goal.)
I absolutely see what you're saying with that. I think I'm definitely the target audience of the abstracted definition, but I've long held that every new object should be introduced with 3 examples and 3 counter-examples. But you said it yourself- that's the style pure math texts are written in! Saying that "we" as a species don't have a good understanding of linear algebra is unbelievable nonsense. I can't conceive of the thought process it would take to say that with a straight face. The fact is, 10 separate YouTube lectures disconnected from anything else is just the wrong way to try and learn a math topic. That's going to have as much or more to do with why dual spaces seem unmotivated as the style of pedagogy does.
It's not that we don't have a good understanding of linear algebra at all. It's that we don't understand how to make it simple. It's like a separate technological problem than actually building the theory itself.
I'm not the person you were originally replying to, but I have taken all the appropriate classes and still find the dual space to be mostly inappropriately motivated. There is a style of person for whom the motivation is simply "given V, we can generate V* and it's a vector space, therefore it's worth studying". But that is not, IMO, sufficient. A person learning the subject can't make sense of that without understanding the alternative: not defining it, and discarding it, and ultimately why one approach was chosen over the others.
I think in 50 years we will look back on the way pure math was written today as a great tragedy of this age that is thankfully lost to time.
My argument is: whoever understands linear algebra has to be able to explain it to anyone having a sufficient math background. The failure to do so signals the lack of understanding. Presenting it as a pure algebraic game cleverly avoids the problems of interpretation, but when you proceed to applications, it leads to conceptual confusion.
One "discovery" I made while learning LA is that most applications are based on mathematical coincidence. Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.
I submit that not only the subject is not well understood, but even the name of the subject is wrong. It should be called "The study of orthogonality". This change of perspective will naturally lead to discussion of orthogonal polynomials, orthogonal functions, create a bridge to representation theory and (on the other end) to the applications in data science. What say you? :-)
I think that "when you proceed to applications" is the issue there. Applications where? For applications in field theory, the spatial metaphor is exactly incorrect! For applications in various spectral theories, it's worse than useless.
What you say regarding the seeming coincidental nature of "real world" applications is basically correct (with correlation specifically there's some other stuff going on, it isn't that surprising, but in general), but unavoidable for any aspect of pure mathematics. Math is the study of formal systems, and the real world wasn't cooked up on a black board. If we can demonstrate that some component of reality obeys laws which map onto axioms, we can apply math to the world. But re-framing an entire field to work with one specific real world use (not even imo the most important real world use!) is just silly.
I love the idea of encouraging students early on to look at different areas of math and see the connections. But linear algebra is connected in more ways to more things than just using an inner product to pull out a nice basis. Noticing that polynomials, measurable functions, etc are vectors is possible without reframing the entire field, and there are lots of uses of linear algebra that don't require a norm! Hell representation theory only does in some situations.
You start with a controversial statement ("Math is the study of formal systems"), and the rest follows. Not everyone agrees with this viewpoint. I think algebraic formalization provides just one perspective of looking at things, but there are other perspectives, and their interplay (superposition) constitutes the "knowledge". Focusing just on the algebraic perspective is a pedagogical mistake IMO.
Some say it's all a kind of hangover from Bourbakism though.
(Treating math as a game of symbols is equivalent to artificial restriction to use just 1% of your brain capacity IMO)
What do you mean with correlation and orthogonality? Like with signal processing, you might calculate the cross-correlation of two signals, and it basically tells you at each possible shifted value, to what extent does one signal project onto the other (so what's their dot product). Orthogonality is not invariant under permuting/shifting entries in just one of the vectors, obviously (e.g. in your standard 2-d arrows space, x-hat is orthogonal to y-hat but not x-hat).
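To make the "dot product at each shift" picture concrete, here is a minimal sketch in plain Python. It assumes circular shifts and made-up sample vectors; the names `dot` and `circular_cross_correlation` are only illustrative, not from any library.

```python
def dot(u, v):
    """Standard dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def circular_cross_correlation(x, y):
    """Dot product of x with y circularly shifted right by each lag k."""
    n = len(x)
    return [dot(x, y[-k:] + y[:-k]) for k in range(n)]


x_hat = [1, 0]
y_hat = [0, 1]

# Lag 0: x_hat . y_hat = 0 (orthogonal).
# Lag 1: y_hat shifted by one entry becomes x_hat, so the dot product is 1.
lags = circular_cross_correlation(x_hat, y_hat)
print(lags)
```

So the same pair of vectors is orthogonal at one shift and perfectly aligned at another, which is the sense in which orthogonality is not invariant under shifting the entries of only one vector.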
Linear algebra studies linearity, not (just) orthogonality. Orthogonality requires an inner product, and there isn't a canonical one on a linear structure, nor is there any one on e.g. spaces over finite fields. Mathematics, like programming, has an interface segregation principle. By writing implementations to a more minimal interface, we can reuse them for e.g. modules or finite spaces. It also makes it clear that questions like "are these orthogonal" depend on "what's the product", which can be useful to make sense of e.g. Hermite polynomials, where you use a weighted inner product.
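A small sketch of the "depends on what the product is" point, assuming a finite-dimensional weighted inner product with positive weights (the function `inner` and the sample vectors are hypothetical, chosen only for illustration):

```python
def inner(u, v, weights=None):
    """Weighted inner product sum(w_i * u_i * v_i).

    With weights=None this is the ordinary dot product. The weights are
    assumed to be positive so that this is a genuine inner product.
    """
    if weights is None:
        weights = [1] * len(u)
    return sum(w * a * b for w, a, b in zip(weights, u, v))


u = [1, 1]
v = [1, -1]

standard = inner(u, v)           # 1*1 + 1*(-1) = 0
weighted = inner(u, v, [1, 2])   # 1*1*1 + 2*1*(-1) = -1

print(standard, weighted)
```

The same two vectors are orthogonal under one product and not under the other. Hermite polynomials are the same situation, with the sum replaced by an integral against a weight function.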
> Namely, the formula for the scalar product of 2 vectors is identical to the formula for the correlation between 2 series of data. There's no apparent connection between the "orthogonality" in one sense and "orthogonality" (as a lack of correlation) in another.
Of course there is. Covariance looks like an L2 inner product (what you're calling the scalar product) because it is an L2 inner product. They're the exact same object.
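A quick numerical check of that claim, as a sketch with made-up data: the Pearson correlation of two series is the cosine of the angle between the mean-centered series, so "uncorrelated" means the centered vectors are orthogonal. The helper names here are illustrative only.

```python
import math


def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))


def center(xs):
    """Subtract the mean from every entry."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def cosine(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def pearson(xs, ys):
    """Pearson correlation written out with explicit sums."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r_stat = pearson(xs, ys)
r_geom = cosine(center(xs), center(ys))

# An uncorrelated pair: the centered vectors are exactly orthogonal.
us = [1, 2, 3]
vs = [1, -2, 1]
r_zero = pearson(us, vs)
```

The factors of 1/n cancel between the covariance and the standard deviations, which is why the two formulas agree.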
Why should it buy you something is the real question.
You don't need to understand it the way the "initial" author thought about it, should that person have given it more thought...
History of maths is really interesting but it's not to be confused with math.
Concepts are not useful in the economic-opportunity sense you are thinking of. Think of them as "did you notice that property?" and then you start doing math, by playing with these concepts.
Otherwise you'll be tied to someone's way of thinking instead of hacking into it.
I know more math than the average bear, but I think the parent has a point even if I don’t totally agree with them.
Take for instance the dual space example. The definition of it to someone who hasn’t been exposed to a lot of math seems fine but not interesting without motivation — it looks just another vector space that’s the same as the original vector space if we’re working in finite dimensions.
However, the distinction starts to get interesting when you provide useful examples of dual spaces. For example, if your vector space is interpreted as functions (for the novice, even they can see that a vector can be interpreted as a function that maps an index to a value), then the dual space is a measure — a weighting of the inputs of those functions. Even if they are just finite lists of numbers in this simple setting, it’s clear that they represent different objects and you can use that when modeling. How those differences really manifest can be explored in a later course, but a few bits of motivation as to “why” can go a long way.
Mathematicians don’t really care about that stuff — at least the pure mathematicians who write these books and teach these classes — because they are pure mathematicians. However, the folks taking these classes aren’t going to all grow up and be pure mathematicians, and even if they are, an interesting / useful property or abstraction is a lot more compelling than one that just happens to be there.
Your post represents a common viewpoint, but I don't agree with it. I'm a retired programmer trying to learn algebra for the purposes of education only. I am not supposed to take an exam or use the material in any material way, so to speak. I'd like to understand. Without understanding motivations and (on the opposite end) applications I simply lose interest. I happen to have a degree in math, and I know for a fact that when you know (or can reconstruct) the intuition behind the theory - it makes a world of a difference. If this kind of understanding is not a goal, then what is?
BTW, by "buying" I didn't mean that it should buy me a dinner, but at least it's supposed to tell me something conceptually important within the theory itself. Example: in the LADR book, the chapter on dual spaces has no consequences, and the author even encourages the reader to skip it :).
> Why should I care about different forms of matrix decomposition? What do they buy me?
A natural line of questioning to go down once you're acquainted with linear maps/matrices is "which functions are linear"/"what sorts of things are linear functions capable of doing?"
It's easy to show dot products are linear, and not too hard to show (in finite dimensions) that all linear functions that output a scalar are dot products. And these things form a vector space themselves, the "dual space" (because each element is a dot-product mirror of some vector from the original space). So linear functions from F^n -> F^1 are easy enough to understand.
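The "every scalar-valued linear function is a dot product" statement can be sketched in a few lines, assuming we work in R^3 with the standard basis. The functional `f` below is a made-up example and `representing_vector` is an illustrative name, not a library function.

```python
def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))


def representing_vector(f, n):
    """Evaluate a linear functional f on each standard basis vector of R^n.

    If f is linear, the resulting vector g satisfies f(x) == dot(g, x).
    """
    basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return [f(e) for e in basis]


# A hypothetical linear functional on R^3.
def f(x):
    return 2 * x[0] - x[1] + 5 * x[2]


g = representing_vector(f, 3)   # the "dot-product mirror" of f
x = [3, 4, -1]

print(g, f(x), dot(g, x))
```

Here `g` is the element of the original space that mirrors `f`; in finite dimensions every linear functional arises this way, which is why the dual space looks like a copy of the original one.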
What about F^n -> F^m? There's rotations, scaling, projections, permutations of the basis, etc. What else is possible?
A structure/decomposition theorem tells you what is possible. For example, the Jordan Canonical Form tells you that with the right choice of basis (i.e. coordinates), matrices all look like a group of independent "blocks" of fairly simple upper triangular matrices that operate on their own subspaces. Polar decomposition says that just like complex numbers can be written in polar form re^it, where multiplication scales by r and rotates by t, so can linear maps be written as a higher dimensional multiplication/scaling and orthogonal transformation/"rotation". The SVD says that given the correct choice of basis for the source and image, linear maps all look like multiplication on independent subspaces. The coordinate change for SVD is orthogonal, so another interpretation is that roughly speaking, SVD says all linear maps are a rotation, scaling, and another rotation. The singular vectors tell you how space rotates and the singular values tell you how it stretches.
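As a rough illustration of the "rotation, scaling, rotation" reading of the SVD, here is a 2x2 sketch in plain Python. It goes in the easy direction only: it builds a matrix from two rotations and a diagonal scaling (angles and stretch factors are arbitrary choices), then recovers the stretch factors as square roots of the eigenvalues of A^T A. It is not a general SVD routine.

```python
import math


def rotation(theta):
    """2x2 rotation matrix by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def singular_values_2x2(a):
    """Singular values of a 2x2 matrix via the eigenvalues of A^T A."""
    ata = matmul(transpose(a), a)
    tr = ata[0][0] + ata[1][1]
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return math.sqrt(tr / 2 + disc), math.sqrt(max(tr / 2 - disc, 0.0))


# A = U * S * V^T: rotate, stretch along the axes, rotate again.
U = rotation(0.7)
V = rotation(-1.2)
S = [[3.0, 0.0], [0.0, 0.5]]
A = matmul(matmul(U, S), transpose(V))

s1, s2 = singular_values_2x2(A)
print(s1, s2)
```

The entries of `A` look unremarkable, but the stretch factors 3 and 0.5 come back out regardless of which rotations were used, which is the content of "the singular values tell you how it stretches".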
So the name of the game becomes to figure out how to pick good coordinates and track coordinate changes, and once you do this, linear maps become relatively easy to understand.
Dual spaces come up as a technical thing when solving PDEs for example. You look for "distributional" solutions, which are dual vectors (considering some vector space of functions). In that context people talk about "integrating a distribution with test functions", which is the same thing as saying distributions are dot products (integration defines a dot product) aka dual vectors. There's some technical difficulties here though because now space is infinite dimensional, and not all dual vectors are dot products, e.g. the Dirac delta distribution delta(f) = f(0) can't be written as a dot product <g,f> for any g, but it is a limit of dot products (e.g. with taller/thinner gaussians). One might ask whether all dual vectors are limits of dot products and whether all limits of dual vectors are dual vectors (as limits are important when solving differential equations). The dual space concept helps you phrase your questions.
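The "limit of dot products" remark can be checked numerically. This is a sketch under stated assumptions: the inner product is approximated by a midpoint rule on [-5, 5], the test function is cos (so f(0) = 1), and the Gaussian widths are arbitrary. All function names are illustrative.

```python
import math


def gaussian(x, sigma):
    """Normalized Gaussian density with mean 0 and standard deviation sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def inner_product(g, f, lo=-5.0, hi=5.0, steps=100_000):
    """Approximate <g, f> = integral of g(x) * f(x) dx with the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += g(x) * f(x)
    return h * total


def smeared_value(f, sigma):
    """Dot product of f with a Gaussian of width sigma."""
    return inner_product(lambda x: gaussian(x, sigma), f)


f = math.cos  # f(0) = 1, the value the Dirac delta would return
values = [smeared_value(f, sigma) for sigma in (1.0, 0.1, 0.01)]
print(values)
```

As the Gaussian gets taller and thinner, the dot product approaches f(0), even though no single function g gives `<g, f> = f(0)` for every f.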
They also come up a lot in differential geometry. The fundamental theorem of calculus/Stokes theorem more-or-less says that differentiation is the adjoint/dual to the map that sends a space to its boundary. I don't know off the top of my head of more "elementary" examples. It's been like 10 years since I've thought about "real" engineering, but roughly speaking, dual vectors model measurements of linear systems, so one might be interested in studying the space of possible systems (which, as in the previous paragraph, might satisfy some linear differential equations). My understanding is that quantum physics uses a dual space as the state space and the second dual as the space of measurements, which again seems like a fairly technical point that you get into with infinite dimensions.
Note that there's another factoring theorem called the first isomorphism theorem that applies to a variety of structures (e.g. sets, vector spaces, groups, rings, modules) that says that structure-preserving functions can be factored into a quotient (a sort of projection) followed by an isomorphism followed by an injection. The quotient and injection are boring; they just collapse your kernel to zero without changing anything else, and embed your image into a larger space. So the interesting things to study to "understand" linear maps are isomorphisms, i.e. invertible (square) matrices. Another way to say this is that every rectangular matrix has a square matrix at its heart that's the real meat.
The thing is, you can teach linear algebra as a gateway to engineering applications or as a gateway to abstract algebra. The second one will require a hell of a lot more conceptual baggage than the first one. It’s also what the book is geared towards.
It is also intended for people who know something about the trade; it isn’t “baby’s first book on maths”. (Why can you graduate high school, do something labelled “maths” for a decade, and still be below the “baby’s first” level, incapable of reading basically any professional text on the subject from the last century? I don’t know. It’s a failure of our society. And I don’t even insist on maths being taught—but if they don’t teach maths, at least they could have the decency to call their stupid two-hundred-year-old zombie something else.)
That conceptual baggage is not useless even in the applied context. For example, I know of no way to explain the Jordan normal form in the 19th-century “columns of numbers” style preferred by texts targeted at programmers. (Not point at, not demonstrate, not handwave, explain: make it obvious and inevitable why such a thing must exist.) Or the singular value decomposition, to take a slightly simpler example. (Again, explain. Your task, should you choose to accept it, is to see a pretty picture behind it.) And so on.
Again, you can certainly live without understanding any of that. (To some extent. You’ll have a much harder time understanding the motivation behind PageRank then, say. And ordinary differential equations, classical mechanics, or even just multivariable calculus will look much more mysterious than they actually are.) But in that case you need a different book and a different teacher.
I like the free course on linear algebra by Strang’s Ph.D student Pavel Grinfeld. It's a series of short videos with online graded exercises. Most concepts are introduced using geometric vectors, polynomials, and vectors in ℝⁿ as examples. https://www.lem.ma/books/AIApowDnjlDDQrp-uOZVow/landing
> Isn‘t there anybody close to the Feynman of Linear Algebra?
That would probably be Gilbert Strang.
While, as a maths person I would prefer a bit more rigour, his choice of topics and his teaching skill make his the most outstanding introductory course I have seen.
I would run a mile from any course that disrespects determinants. And that includes Axler's!
Also I wish more Linear Algebra courses would cover Generalized Inverses.
As mentioned, the book was intended to be a "second course" in linear algebra. I personally self-studied out of the 3rd edition of Axler, and found it very helpful for understanding exactly what is going on with all the matrix computations we do.
Plus, the same can be said about artists. After all, it's all self-aggrandization, and art is not made to be simple or intuitive.
I actually found the book quite intuitive and helpful in understanding linear algebra. It does explain a lot of the intuition for many definitions, as well as mathematical techniques.
It's easy when presented with new things that you don't understand to reflexively dismiss them, but the ideas here are quite solid. It's also a textbook which aims to introduce students to a slightly higher level of mathematical thinking.
I self studied from this book as an undergrad. I was an EE major and took linear algebra as part of the mandatory ODEs class but didn’t “get it.” At a certain point, it became clear that if I wanted to learn the more advanced applied math I was interested in studying, I needed to really understand linear algebra. I thought Axler was great at introducing both the material and teaching me how to prove things rigorously. The month or so I spent that summer reading that book made the rest of the math I took in undergrad trivial.
Guess I need to re-learn it again.