Unfortunately, I think statistics will continue to be under-appreciated. All of my mathematics professors in undergraduate said that statistics and linear algebra are the two most useful fields of math to know and that dedicating time to studying them will pay dividends. It still surprises me when I apply for jobs today how few places, even financial or scientific firms, distinguish between statistics and mathematics in their application forms.
People want to become machine learning engineers because it's the sexy thing right now, but they don't want to learn the statistics/linear algebra/optimization necessary for the roles. In my experience, these "AI" and "data science" programs are largely just cash-grabs at most universities. I don't doubt that M.I.T.'s will be rigorous, but I'm largely skeptical of how useful these programs actually are.
I don't know. I studied a lot of Machine Learning at UC Berkeley (even though my specialization was in systems, since I like it more) and it was all very rigorous linear algebra, probability theory, optimization, signal processing, information theory, statistics, algorithm analysis etc... Sure, we also took classes about designing heuristics, data visualization etc., but they were nowhere near as serious/hard as the other classes, so students focused on the other classes. Pretty much all students who were serious about ML took upper-division Linear Algebra, Abstract Algebra and/or Analysis classes. We all took EE, Stats, CS, Data Science etc... and saw ML from a bunch of different aspects (e.g. the EE perspective being more signal/information-esque, or the CS perspective more computational (kernels!) etc...). I have no reason to believe MIT will be any less rigorous. I think most (almost all?) random ML intros online are filler courses without much relation to "actual" (?) ML, but I have no reason to believe something from MIT, Berkeley, Stanford, CMU etc. will be like that.
Yup. Statistics does continue to be under-appreciated, nowhere more than among dudebros who read up about ML for no reason except that's where the money seems to be.
MIT alum here. I expect this new department to be a gross embarrassment compared to the rest of campus initially. Then they'll enforce standards to bring it into line, which will suppress enrollment, and then the department will be merged into the CS department. Won't be the first time.
Generally, the traditional university model has been adding "<insert job title> program" for a long time, whether or not they have a useful curriculum to put students through.
There are office management and MSFT licensing degrees. The most popular degrees (at least where I live) for both undergraduates and graduates are various generic "business degrees."
In practice, universities have curriculums for accounting, finance and economics.
There is no curriculum for social media or digital growth hacking. There are job titles. There are students who want to enroll. Employers are asking for these graduates (in theory; grad salaries for these are low). Politicians are willing to fund them...
Does a bachelors of social media businessing serve a student 10, 20, 30 years later?
So, will these ML/AI programs really produce better graduates than maths, statistics or CS programs? Dunno. I'll wager that they're a whole lot better than business stuff degrees.
I'd actually wager that MIT will put students through decent statistics classes. Hopefully they'll also have them write a decent amount of code too.
It has always seemed to me that it is the correct distinction. Although there exists a branch of mathematics called mathematical statistics, by its very nature statistics is more like physics, in that it is trying to develop efficient methods of getting information about the objective world based on a certain kind of observations and measurements; and, of course, just like theoretical physics, statistics is highly mathematical, which often creates confusion regarding the actual subject of investigation.
I don't know what job postings you are seeing but the vast majority of Data Scientist/ML Engineer postings at large tech companies that I've seen explicitly mention Statistics as a requisite skill.
Sure, and most of the time they do not vet the candidate during the interview. Even if they do, they ask basic questions about normal distributions and other basic concepts.
How important is optimization? Like a lot of engineers I know, I took a few required courses in undergrad on linear algebra and stats. But I've never studied optimization. And I see it come up all the time on lists of central theory for ML & AI. . .
Is there a classic textbook on the subject? Are there any free online courses that are considered good?
Optimization as a tool is important and widely used, but... almost everything grabbing headlines uses some form of SGD + Momentum. Very little of the actual progress comes from better optimization.
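For anyone who hasn't seen it written out, here is a minimal sketch of what "SGD + Momentum" means. The toy objective, the noise level, and the names `sgd_momentum` and `noisy_quadratic_grad` are all made up for illustration; real training code would get its gradients from minibatches of data rather than from a hand-written formula.

```python
import random

def sgd_momentum(grad_fn, w, lr=0.05, beta=0.9, steps=500, seed=0):
    """Heavy-ball style SGD: keep a running velocity of past gradients."""
    rng = random.Random(seed)
    w = list(w)
    v = [0.0] * len(w)
    for _ in range(steps):
        g = grad_fn(w, rng)                           # noisy gradient estimate
        v = [beta * vi + gi for vi, gi in zip(v, g)]  # accumulate velocity
        w = [wi - lr * vi for wi, vi in zip(w, v)]    # step along the velocity
    return w

def noisy_quadratic_grad(w, rng):
    # Hypothetical toy objective: f(w) = 0.5*(w0 - 3)^2 + 5*(w1 + 1)^2.
    # Small Gaussian noise stands in for minibatch sampling noise.
    return [(w[0] - 3.0) + rng.gauss(0, 0.01),
            10.0 * (w[1] + 1.0) + rng.gauss(0, 0.01)]

w_final = sgd_momentum(noisy_quadratic_grad, [0.0, 0.0])
print(w_final)  # should land close to the minimizer (3, -1)
```

The whole method is the two update lines inside the loop, which is part of why it is the default: there is very little to tune besides `lr` and `beta`.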
Optimization can be pretty useful. Most stats / ML problems are posed as minimizing a function subject to some constraints.
Depending on your problem, you might be able to exploit special structures to solve problems faster than just doing gradient descent. If you know linear algebra and stats, you'll be fine getting through an optimization book.
Boyd's book is canonical at this point, but might be hard to get through. Before you get to actually optimizing anything, you need to make your way through some chapters on convex analysis with little application.
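To make the "exploit special structure" point concrete, here is a small sketch using ordinary least squares on a line, where the structure (a quadratic objective) gives a closed-form answer and gradient descent needs thousands of iterations to reach the same place. The data points and function names are invented for the example.

```python
def fit_closed_form(xs, ys):
    """Simple linear regression via the normal equations (one pass, no iteration)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Same objective (mean squared error), minimized by plain gradient descent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Made-up data, roughly y = 2x + 1 with some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

exact = fit_closed_form(xs, ys)
iterative = fit_gradient_descent(xs, ys)
print(exact, iterative)  # the two should agree to many decimal places
```

Both routes minimize the same function; the closed form just uses the fact that the gradient is linear in the parameters, so setting it to zero is a solvable linear system.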
>>People want to become machine learning engineers because it's the ... thing right now, but they don't want to learn the necessary statistics/linear algebra/optimization necessary for the roles.
Most people who will do these jobs, will stitch libraries into producing an application, like every other programming job. They will need passing knowledge of things, but only make things work.
This is for the same reason that anyone building a web app is not writing their own TCP/IP stack and their own operating system.
As a matter of fact, I wouldn't be surprised if most people who claim to do AI are just writing SQL queries to get averages and means.
In the heyday of the Big Data craze, people were using Hadoop and Pig to deal with files a few kilobytes in size, and calling it 'Big Data'.
I don't doubt it and I don't doubt that M.I.T. will create an intensive AI college. My point was that a lot of universities, even distinguished ones, are recognizing that there's a real demand and hype for "AI/Data Science" degrees and in an effort to maximize enrollment and appeal they often minimize the mathematical and statistical requirements.
I don't believe that you need an advanced degree to become a competent ML engineer, but the math/stats is a necessary prerequisite, and these prerequisites are often poorly defined. At my college, the only prerequisites for the graduate-level ML course were the freshman-level intro to stats class and multivariable calculus. About 50% of the class dropped when they realized they didn't know how to construct Gaussian models or perform convex optimization.
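For a sense of the baseline being described, here is a minimal sketch of the simplest possible "Gaussian model": fitting a one-dimensional Gaussian by maximum likelihood. This is my own illustration, not material from that course, and the function names and sample data are made up.

```python
import math

def fit_gaussian_mle(data):
    """Maximum-likelihood estimates of the mean and variance of a 1-D Gaussian."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE divides by n, not n - 1
    return mu, var

def log_likelihood(data, mu, var):
    """Log-likelihood of i.i.d. data under N(mu, var)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
               for x in data)

sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up data
mu_hat, var_hat = fit_gaussian_mle(sample)
print(mu_hat, var_hat, log_likelihood(sample, mu_hat, var_hat))
```

Deriving those two estimates by setting the gradient of the log-likelihood to zero is exactly the kind of exercise an intro stats class often skips.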
Maybe it's just me, but having gone through all the stats and maths behind ML, it seems like ultimately the less interesting part (though to be fair, algorithm design is similarly uninteresting for similar reasons). We're talking about a lot of very long-in-the-tooth concepts that are still the basis of many, many approaches. They're important, but it's well-worn territory.
The underappreciated parts of AI, in my experience, are more philosophical; about the nature of reasoning and approximating or beating human thought. About autonomous agents, non zero-sum games and ethical, non-maximizing functions. There's a huge overlap with logic (philosophical and mathematical) here, and I haven't seen that really broached at any of these big programs.
It definitely is not just you. I spent the first two years of my PhD wrapping my head around the stats and maths commonly used in ML, and realized that mathematically (as "theoretical" ML is practiced today), most answers are already provided in the classical work of statisticians and probabilists. There are many fascinating questions of probability theory and statistics, but most have little to do with AI. In fact, in terms of the biggest empirical success story (deep neural networks), there are essentially no theorems providing a solid conceptual leap of understanding. Mikhail Gromov goes one step further regarding the lack of theory for neural networks (https://www.youtube.com/watch?v=g4Wl3Ggho6k), and provides a fascinating overview of his thoughts in:
https://www.ihes.fr/~gromov/category/ergosystems/
I am interested in the points you raise, but also realized that I would not find a good environment for it at MIT in EECS, for reasons that are rather obvious from the article's subtext. As such, the last year or so has been spent in a search for good alternatives in terms of research, and I am slowly finding answers. I am happy to discuss more over email.
Long story short: you are certainly not the only one who thinks that way.
EDIT: added a video link to Mikhail Gromov's actual views for better accuracy.
I see a lot of graduate students focusing on practical uses of ML algorithms as a result of this. A lot of people don't realize that a good portion of the math is already figured out, and that it's in the implementation of these algorithms that they can find more interesting results.
It must be noted, though, that "approximating human thought" is just one direction of investigation - and not the most important one at that; as interesting as it may be, it makes almost as much sense as trying to have computers resemble human brains. In other words, the true AI, when it arrives, will not think like us humans (even if at some level it might pretend that it does).
> ethical
The AI will be just as "ethical" as a computer or an assault rifle.
Sort of my point. Current (by that I mean post-early 20th century) approaches were to mimic what we believe to be human reasoning. That's clearly limited.
> The AI will be just as "ethical" as a computer or an assault rifle.
I think that's reductive. Reasoning is not entirely analytical. There are other implications and concerns to artificial sentience.
You can have linear algebra and stats for the ML engineer classes as coursework. Most sciences have an applied linear algebra class which doesn't require the proof-heavy math version. And most colleges require a year of stats. I think, though, that while linear algebra for the sake of linear algebra is good, it isn't strictly necessary for AI and you can learn what you need. If you want to go deeper then go deeper. You could argue you need a class in ODEs and stochastic optimal control theory to understand RL, but you don't really. Maybe in grad school, maybe in some research area inside of RL, but not in undergrad. Of course the linear algebra will help you. The best thing would be for the RL, control systems, stats, physics and other related folks to start speaking the same language.
That's much of the math behind AI, but there's much more to building real things with it. It's a cross-disciplinary field touching:
- Math, as you say
- Software engineering, especially data engineering
- Design - since the math and engineering enable new kinds of problems to be solved by computers, there's a lot of unexplored design territory
- The domains of all the input and output modalities it touches, like linguistics, computer vision, etc.
- Increasingly, ethics
Sure, each of these topics is already covered by existing university departments, but the boundaries between departments are arbitrary and often limiting anyway. Why not establish a new locus that brings much of the above under one physical and administrative roof?
>It is expected that the Department of Electrical Engineering and Computer Science (EECS), the Computer Science and Artificial Intelligence Laboratory (CSAIL), the Institute for Data, Systems, and Society (IDSS), and the MIT Quest for Intelligence will all become part of the new College; other units may join the College.
I agree. I hope that M.I.T. is able to consolidate these domains in a cohesive manner. The mathematics of ML is super unfriendly because it borrows from so many inter-related domains. You'll end up with notation that uses sigma to represent both summation and covariance within a single expression, or inconsistencies with whether vectors are column vectors by default or row vectors. It makes your understanding of the material dependent on whether you had the same background as the author.
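To illustrate the kind of clash described above, here is a sketch of two standard expressions (written by me for this example, not taken from any particular textbook). The first is the log-likelihood of n i.i.d. points under a multivariate Gaussian, where the same Greek letter shows up as both a summation sign and a covariance matrix. The second is the same linear map written under the column-vector and the row-vector convention.

```latex
% Same letter, two meanings: \sum is a summation, \Sigma is a covariance matrix.
\ell(\mu, \Sigma)
  = -\frac{n}{2} \log \det (2 \pi \Sigma)
    - \frac{1}{2} \sum_{i=1}^{n} (x_i - \mu)^{\top} \Sigma^{-1} (x_i - \mu)

% Column-vector convention versus row-vector convention for the same map:
y = W x
\qquad \text{versus} \qquad
y^{\top} = x^{\top} W^{\top}
```

Neither form is wrong, but a reader who learned one convention has to translate every line written in the other.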
In mathematics, good and consistent notation does help learning stuff; but I noticed that it can also have some sort of "parasitic" effect and interfere with the true understanding of things, and so I find it useful to try and see if I can understand something given a less elaborate notation (or without one at all).
In general, one has to remember that mathematical notation was invented to make calculations (on paper) more efficient and not with the goal of making it easier to understand things.
I wonder if this attitude was a thing when CS was first introduced as a major. At the time it must have looked like someone basically took a tiny part of electrical engineering and made it into its own field of study.
Historically, it was as likely to be part of the math department as it was to be part of electrical engineering. In part, this was because at the time electrical engineering was much more about analog circuits, power systems, etc.
In Germany, you can still see in the curriculums whether CS emerged out of math or EE. When you have a lot of low-level programming, operating systems, hardware design etc., the department was born in EE. If the emphasis is more on math, and they teach you an abstract view of "computers" (the Turing machine/lambda calculus as the foundation of computing) and view processors as just an application, it came from the math department. It's probably the same in the rest of the world.
This type of condescension is precisely why I am glad I was introduced to the field of ML/pattern recognition/statistical learning (whatever you want to call it) through a course from the CS department rather than the statistics department.
I have always felt course offerings from CS are more approachable/amenable to beginners in this area. Maybe anecdotal, but statistics depts. have a gate-keeping attitude, a sense of 'oh you don't know the math already? too bad, we are not for you'.
P.S: Approachable/amenable does not mean it cannot be rigorous or that you have to cut the math short. You just build up gradually rather than throwing math books in people's faces from the very beginning.
There is no subject under the name "Statistics and Nonlinear Function Approximation" that describes the algorithms involved in NLP. No subject or concept in the name "Statistics and Nonlinear Function Approximation" that describes the history and codification of breaking down an image into thousands of additional images and then performing the "statistics" - there is much more in ML/AI than what you'd like to call that college.
Neural networks are all "nonlinear function approximation". Last I checked, the state-of-the-art in all the fields you mention is some variation of deep neural network.
I don't mean that dismissively. On the contrary, function approximation is a very rich and deep topic.
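As a toy illustration of "neural network as nonlinear function approximator", here is a sketch of a one-hidden-layer tanh network fit to sin(x) with plain gradient descent. Everything in it (the layer size, learning rate, step count, target function, and the name `train_tiny_mlp`) is an arbitrary choice for the example, and it is nowhere near how a real deep learning library does things.

```python
import math
import random

def train_tiny_mlp(xs, ys, hidden=8, lr=0.01, steps=2000, seed=0):
    """Fit y ~ sum_j w2[j] * tanh(w1[j] * x + b1[j]) + b2 by full-batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.gauss(0, 1) for _ in range(hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(hidden)]
    w2 = [rng.gauss(0, 0.5) for _ in range(hidden)]
    b2 = 0.0
    n = len(xs)

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    def mse():
        return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / n

    initial_loss = mse()
    for _ in range(steps):
        gw1 = [0.0] * hidden
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            d = 2.0 * err / n                      # d(MSE)/d(prediction)
            for j in range(hidden):
                dh = d * w2[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                gw2[j] += d * h[j]
                gw1[j] += dh * x
                gb1[j] += dh
            gb2 += d
        for j in range(hidden):
            w1[j] -= lr * gw1[j]
            b1[j] -= lr * gb1[j]
            w2[j] -= lr * gw2[j]
        b2 -= lr * gb2
    return predict, initial_loss, mse()

# Target: sin(x) sampled on a grid over [-3, 3].
xs = [-3.0 + 0.3 * i for i in range(21)]
ys = [math.sin(x) for x in xs]
predict, initial_mse, final_mse = train_tiny_mlp(xs, ys)
print(initial_mse, final_mse)  # the error should drop as the network fits the curve
```

Swap sin(x) for "pixels to labels" and add many more layers and you have the same idea at scale: pick a flexible parametric family and adjust the parameters to reduce an error measure.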