by graycat 2879 days ago
Might look at the thread

Foundations Machine Learning (bloomberg.github.io)

at

https://news.ycombinator.com/item?id=17519591

There, machine learning (ML) is basically a lot of empirical curve fitting. The context is usually a lot of data: thousands of variables and millions or billions of data points, that is, observations, each pairing values of thousands of independent variables with the value of the corresponding dependent variable. The work is all a larger, more-data version of this: You have a high school style X-Y coordinate system and some points plotted there. So, you want to find values for coefficients a and b so that the line

y = ax + b

fits the points as well as possible. But, you can do variations, try to fit, say,

log(y) = a sin(x) + b

Or replace log or sin with any functions you want and try again.
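As a rough sketch of that kind of fitting, here is ordinary least squares in plain Python. The function names fit_line and fit_transformed and the sample data are made up for illustration, and least squares is only one assumed meaning of "fits as well as possible". The point is that fitting log(y) = a sin(x) + b is the same straight-line fit applied to transformed data.

```python
import math

def fit_line(us, vs):
    """Least-squares fit of v = a*u + b; returns (a, b)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    # Slope is covariance(u, v) / variance(u); intercept follows from the means.
    cov = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    var = sum((u - mean_u) ** 2 for u in us)
    a = cov / var
    b = mean_v - a * mean_u
    return a, b

def fit_transformed(xs, ys, f=math.sin, g=math.log):
    """Fit g(y) = a*f(x) + b by transforming the data and reusing fit_line."""
    return fit_line([f(x) for x in xs], [g(y) for y in ys])

# Hypothetical data generated exactly from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
a, b = fit_line(xs, ys)

# Hypothetical data generated exactly from log(y) = 0.5 sin(x) + 1.
ys2 = [math.exp(0.5 * math.sin(x) + 1.0) for x in xs]
a2, b2 = fit_transformed(xs, ys2)
```

Swapping f and g for other functions gives the "try again" variations; the fitting step itself does not change.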

The logic, the rational support, is essentially as follows: So, take, say, 1000 x-y pairs. Partition these into 500 training points and 500 test points. Find the best fit you can, using whatever fits, to the training data. Then take the equation and see how well it fits the test data. If the fit to the test data is also good, then that is your model.
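The train/test logic can be sketched as follows, again in plain Python. The synthetic data (a noisy line y = 3x - 2), the random seed, the helper names, and the use of mean squared error as the measure of fit are all assumptions for illustration, not anything from the course.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x

def mse(a, b, xs, ys):
    """Mean squared error of the line y = a*x + b on the given points."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: 1000 points from y = 3x - 2 plus small Gaussian noise.
rng = random.Random(0)
points = []
for _ in range(1000):
    x = rng.uniform(0.0, 10.0)
    points.append((x, 3.0 * x - 2.0 + rng.gauss(0.0, 0.1)))

# Partition into 500 training points and 500 test points.
rng.shuffle(points)
train, test = points[:500], points[500:]
train_x, train_y = [x for x, _ in train], [y for _, y in train]
test_x, test_y = [x for x, _ in test], [y for _, y in test]

# Fit on the training data only, then check the fit on the held-out test data.
a, b = fit_line(train_x, train_y)
train_mse = mse(a, b, train_x, train_y)
test_mse = mse(a, b, test_x, test_y)

def predict(x):
    """Apply the fitted model to a new value of x."""
    return a * x + b
```

If test_mse comes out close to train_mse, the fit holds up on data it was not fitted to, which is the whole justification for using predict on new x values.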

Now you want to apply the model in practice, that is, apply the model to data it did not see in the given 1000 points. So for the application, you will be given a value of x, plug it into the equation, and get the corresponding value of y. That's what you want -- maybe the value of y gives you Y|N for ad targeting, Y|N cancer, what MSFT will be selling for next month, what the revenue will be for next year, etc.

The rational, logical justification here is an assumption (which should have some justification from somewhere) that the x you are given and the y you want for that value of x are sufficiently like the x-y values you had in the original 1000 points.

Okay. So this ML is empirical curve fitting to a lot of data to make a predictive model that is found with training data, tested with test data, and applied where the given data in the application is like the data used in the fitting.

The OP mentions that some people believe that to make progress to real machine intelligence, we need more math than what I outlined.

My guess is that to make that intended progress, for all but some tiny niche cases, we first need some much more powerful and quite different ideas, techniques, etc. than in the curve fitting ML I outlined.

Yes, there is a chance that with lots of data from working brains and lots of such empirical fitting we will be able to find some fits that will uncover some of the workings of the brain crucial for real intelligence. Uh, that's a definite maybe!

But there is a lot more to what can be done to build predictive models than such curve fitting, empirical or otherwise. I outlined some such in the thread that I referenced above.

So, for the question in the OP, what math? Well, if you want to pursue directions other than the empirical curve fitting in the Bloomberg course I referenced above, my experience is -- quite a lot. For the education, start with a good undergraduate major in pure math. So, cover the usual topics: calculus, abstract algebra, linear algebra, differential equations, advanced calculus, probability, statistics. Then continue with more in algebra, analysis, and geometry.