by karolkozub 2034 days ago
I love the article, but I don't agree with the premise that machine learning equals neural nets. In my understanding machine learning is a very broad term that just as well could be applied to the polynomial model if the constants were optimized algorithmically. I feel like the presented argument is more for transparent vs opaque models rather than machine learning vs something else. Also one could argue that the polynomial model is just a perceptron[0].

[0]: https://en.wikipedia.org/wiki/Perceptron
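
To make the "constants optimized algorithmically" point concrete, here is a minimal sketch in plain Python. It treats a polynomial as a single linear unit over fixed features [1, x, x^2] (a perceptron without the threshold step) and fits the constants by gradient descent. The data, the learning rate, and the function names are made up for illustration and have nothing to do with the article's car model.

```python
def features(x, degree):
    """Fixed polynomial features [1, x, x^2, ...]."""
    return [x ** k for k in range(degree + 1)]


def predict(weights, x):
    """One linear unit: a weighted sum of the features, no activation."""
    return sum(w * f for w, f in zip(weights, features(x, len(weights) - 1)))


def fit(xs, ys, degree=2, lr=0.1, steps=3000):
    """Plain batch gradient descent on mean squared error."""
    weights = [0.0] * (degree + 1)
    n = len(xs)
    for _ in range(steps):
        grads = [0.0] * (degree + 1)
        for x, y in zip(xs, ys):
            err = predict(weights, x) - y
            for k, f in enumerate(features(x, degree)):
                grads[k] += 2 * err * f / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Made-up data generated from y = 1 + 2x - 3x^2 on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]
ys = [1 + 2 * x - 3 * x * x for x in xs]
weights = fit(xs, ys)  # should end up close to [1, 2, -3]
```

Nothing here is specific to neural nets: the "learning" is just an optimizer adjusting three constants.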


The machine learning course at my university starts out with polynomial regression and estimators, statistics of classification, etc. Neural networks are only one tool in a large toolbox.

But they are all the rage and it is no surprise that a lot of people want to play with them.

Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done. Or give examples of one class and let the neural net generate new ones. Doing away with the abstraction beforehand is an enticing prospect.

> Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done.

This way of thinking about it leads directly to things like statistical redlining.

It's also not specific to neural networks. I take a similar approach with logistic regression. Except that I like to replace the "and you're done" step with, "and you're ready to analyze the parameters to double check that the model is doing what you hope it is." Even when linear models need some help, and I need to do a little feature engineering first, I find that the feature transformations needed to get a good result are generally obvious enough if I actually understand what data I'm using. (Which, if you're doing this at work, is a precondition of getting started, anyway. IMNSHO, doing data science in the absence of domain expertise is professional malpractice.)

There is no "and you're done" step outside of Kaggle competitions or school homework, because machine learning models in production need ongoing maintenance to ensure they're still doing what you think they're doing. See, for example, https://research.google/pubs/pub43146/

That's an excellent approach -- and how I try to introduce people to NNs.

NNs are just polynomial regression with polynomial activations; and piece-wise linear regression with relu activations (etc.).

A NN is just a highly parameterized regression model -- for better, or worse.
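
A rough way to see the relu half of that claim: the sketch below builds a one-hidden-layer net with three relu units and checks that its output is a straight line between the kinks. The weights are hand-picked and made up; nothing is trained here.

```python
def relu(z):
    return max(0.0, z)


def tiny_net(x):
    """One hidden layer, three relu units, hand-picked (made-up) weights.

    The units switch on or off at x = 0, x = 1 and x = 2, so the output
    is linear on each interval between those points.
    """
    hidden = [relu(1.0 * x + 0.0), relu(1.0 * x - 1.0), relu(-1.0 * x + 2.0)]
    return 0.5 * hidden[0] - 2.0 * hidden[1] + 1.0 * hidden[2] + 0.25


def slope(f, a, b):
    """Average slope of f between a and b."""
    return (f(b) - f(a)) / (b - a)


# One slope per linear piece: x < 0, 0 < x < 1, 1 < x < 2, x > 2.
pieces = [slope(tiny_net, a, b) for a, b in [(-3, -1), (0.2, 0.8), (1.2, 1.8), (3, 5)]]
```

Each hidden unit adds at most one kink, so more units just means more line segments to fit with.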

That was an eye-opener for me.

I had always thought of neural nets in terms of the massive connected graph that, in my head, somehow behaved like a machine.

Then I realized that in the end it's just a representation of a massive function, f: R^m -> R^n, which needs to be fitted to match inputs and outputs.

I know this is not precisely correct and glosses over many, many details - but this change in viewpoint is what finally allowed me to increase the depth of my understanding.

It's unclear that there is such a thing as an NN, and in any case, that it is graph-like.

What are the nodes and edges?

There is a computational graph which corresponds to any mathematical function -- but it is not the NN diagram -- and not very interesting (eg., addition would be a node).

NNs are neither neural nor networks.

> Cynically, neural networks are easier as you don't really have to think about your model. Give some examples with some classes and you're done. Or give examples of one class and let the neural net generate new ones. Doing away with the abstraction beforehand is an enticing prospect.

If you're trying to solve a well-understood business problem, sure, but my issue with this is that you pigeonhole yourself and your solution. I'm much more interested in understanding the model than doing the implementation, because that allows you to build on top of what you get out of the box in a framework, for example. It's like learning React before learning JavaScript. It might be a good short term solution but long term it certainly isn't.

Oh, I was not defending neural networks. This was the cynical sales pitch for the case where you don't want to employ mathematicians or computer scientists, but just throw code and computational resources at the problem.
But isn't that an important part of the value of neural networks? Mathematicians are expensive so we'd like a computer to make a model for us, just like drivers are expensive so we want self-driving cars.
The issue with that is that NNs fail in some really interesting ways, so you still need a lot of effort to get a robust solution. Remember, after some serious investments by many organizations, self-driving cars are still in development. At the same time, a few people have demonstrated a basic system that seems close without nearly that much investment. Unfortunately, the difference between a demo and a working solution can be several orders of magnitude.
> It might be a good short term solution but long term it certainly isn't.

It is only a temporary solution - unless it works.

https://www.youtube.com/watch?v=pY7nx5Z6Kzo

Could a person with ML experience come up with this solution? Yes! Would his ML experience help him come up with this solution compared to someone who just learned numerical methods and automatic control theory? No. This isn't an ML solution.

Just because something is taught in an ML course doesn't mean that it is ML. It is pretty common for physics classes to teach maths and for chemistry classes to teach physics for example.

So if something is taught in ML class but also in statistics class then it is statistics and not ML. If something is taught in ML class but also in a numerical methods class then it is numerical methods and not ML.

Well... I guess most people equate ML with AI and use these terms interchangeably.

If you just replace ML with AI everywhere in this article it is going to make sense.

The article has other problems, one being the main premise.

The problem isn't to drive a car around a track (which is what the polynomials did) but rather to write a program that can figure out how to drive a car without you knowing how to solve it.

Well that depends on your definition of AI. Which isn't well defined. We call AI what we perceive as "magic". Black box algorithms have a higher chance of being perceived that way (e.g. neural nets). When you get some insight into how an algorithm works (easier for transparent box algos, but same holds for black box algorithms), you start to see it less and less as "magic", and, consequently, you're less likely to refer to it as an (artificial) intelligence. Because ultimately, that's what we mean by intelligence -- magic. When we say that something is intelligent, we liken it to ourselves: it evokes a sense of identification. It all comes back to a sense of humans being fundamentally separate from "the other" (computers in this case). If we saw the mathematical models and algorithms as just that, we wouldn't call them AI. Also, if we didn't think of our intelligence as more than the behaviour of our biological computer, we wouldn't be enchanted by the concept of non-biological systems mimicking some of our behaviour.
A professor once told us in class: "when it works and you don't understand why, it's called AI; when you do, it's called an algorithm."
I disagree.

We don't find these systems intelligent because, on inspection, they aren't.

We are intelligent. Not "magically", but actually nevertheless.

Our intelligence, and that of dogs (and mice, etc.), consists in the ability to operate on partial models of environments; to be dynamically responsive to them; and to skilfully respond to changes in them.

This sort of intelligence requires the environment to physically reconstitute the animal in order to non-cognitively develop skills.

It is skilful action we are interested in, and that is precisely what is missing in naive rule-based models of cognition.

You provided an illustration of "magic". It's important to realise that you don't need a complex algorithm to produce complex behaviour (see Stephen Wolfram and his work on cellular automata).
In my understanding AI is an even broader term and means "any solution that imitates intelligent behavior". E.g. expert systems which are pretty much a bunch of if-then rules are also considered AI.
It's my understanding as well. Many things that a modern programmer thinks of in terms of "computation" were once considered to be "AI". Lisp and Prolog were "AI"; even the A* algorithm is still considered a rudimentary form of "AI" in textbooks just because it uses heuristics. There's a joke that says "every time AI researchers figure out a piece of it, it stops being AI" [0].

It's why I use "AI" and "ML" interchangeably although I know it's technically incorrect - the formal definition doesn't match what people are currently thinking.

[0] https://en.wikipedia.org/wiki/AI_effect

There have traditionally been different approaches and definitions for AI. Some emphasize behaviour while others emphasize the logic behind the behaviour. (In some sense, while expert systems of course were an attempt at getting practical results, they might also have been an attempt to implement what was seen as human reasoning, while e.g. black box machine learning could be more about just getting the behaviour we want.) Some approaches view agents as intelligent if their action resembles humans or other beings that we consider intelligent, while other approaches are merely interested in whether they perform well at a specified task, perhaps more so than humans.

So yes, "any solution that imitates intelligent behaviour" is probably right, but with nuances with regard to what that actually means.

That's not symbolic AI though. That's only statistical methods. The statistical methods are all the rage now, but explainable AI that can reason is an important area of computer science (and research) and uses formal methods.

Edit: yeah, you can downvote this, but current AI research splits right along this line, whether it's symbolic or statistical. Some AI courses will use NNs, others will use Prolog and ASP. You can't just dismiss a whole field of research by reducing AI to statistical methods.

"Expert systems" were the hot research area in AI prior to machine learning (data driven methods, basically). Old methods and problems from that era like automated reasoning still have some research and applications going on, but aren't remotely as big an area as machine learning.
When I see "symbolic AI" I immediately think of Gary Marcus and immediately feel disdain towards the topic because of his behaviour on Twitter and other places.
I don't know the dude. I "only" know that my field of research is deductive reasoning in interactive applications and that this area falls under "Logic Programming" and LP is an area of AI.

I know that AI researchers are usually a bit dismissive about the other area. I don't like statistics either. Reducing the whole of AI research to statistical approaches (and NNs are one of those) is disingenuous and dismisses hundreds of researchers doing important work.

You may not want to have rule-based image recognition, but if your car decides to run over somebody, I feel we better have an explanation for this behaviour based on reasoning and logic.

I don’t think anyone is dismissing symbolic AI. As far as I can see, it’s just not beating current SOTA results of NNs? It’s not really about ideology, it’s about what currently has superior performance. Model interpretability is not always a requirement.
The author may have implemented ML when they optimized their polynomial constants:

> If I was developing a racing game using this as the AI, I’d not just pick constants that successfully complete the track, but the ones that do it quickly.

If they wrote code that automatically picked constants that successfully completed the track quickly, (even something as simple as sorting the results by completion time), then that's reinforcement learning.
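
For what it's worth, "automatically picked constants" could be as simple as the random search sketched below. `lap_time` is a made-up stand-in for the article's simulator (the formula is invented purely so the example runs), and whether you file plain search like this under reinforcement learning or just optimization is a matter of definition.

```python
import random


def lap_time(constants):
    """Hypothetical stand-in for the racing simulator.

    Returns a lap time in seconds, or None if the car leaves the track.
    """
    a, b = constants
    if abs(a) > 2.0 or abs(b) > 2.0:
        return None  # pretend the car crashed
    return 60.0 + 10.0 * (a - 0.7) ** 2 + 5.0 * (b + 0.3) ** 2


def pick_constants(trials=2000, seed=0):
    """Try random constants and keep the fastest set that finishes the lap."""
    rng = random.Random(seed)
    best, best_time = None, float("inf")
    for _ in range(trials):
        candidate = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        t = lap_time(candidate)
        if t is not None and t < best_time:
            best, best_time = candidate, t
    return best, best_time


best, best_time = pick_constants()
```

The loop never looks inside the model; it only needs a score per candidate, which is why the same idea works for polynomials and neural nets alike.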

I agree, machine learning can certainly use transparent models, and classic models can certainly be non-transparent. I tend to think of machine learning as any method which optimizes not only the model parameters, but also the model structure in a single step. Though then again, the latter are just parameters of a more abstract model. So it's all always optimization in the end.
> Also one could argue that the polynomial model is just a perceptron

One also can argue otherwise [1].

[1] https://matloff.wordpress.com/2018/06/20/neural-networks-are...

I came here to state the same.

I am not sure when we changed the terms, but back in the day, this would happily fall into machine learning. As he mentioned, if you want a good driver, you would execute thousands of experiments to pick a good set of parameters.

As soon as we recognize plain old regression as machine learning, then we start to see "averages" as models of systems and how practically useful could that be?
I think you're being facetious, but on the off-chance you're not, and for the benefit of others: averages are incredibly practically useful for modeling systems. Parameter estimation (which generalizes averages and applies to other distribution features like variance) is a foundational modeling methodology. It's useful for both understanding and forecasting data. Measures of central tendency are nearly always good (if obviously imperfect) models of systems.

Here is a trivial example: one of the best ways of modeling timeseries data, both in and out of sample, is to naively take the moving average. This is a rolling mean parameter estimate on n lagged values from the current timestep. Not only is this an excellent way of understanding the data (by decomposing it into seasonality, trend and residuals), it's a competitive benchmark for future values. The first step in timeseries analysis shouldn't be to reach for a neural network or even ARIMA. It should be to naively forecast forward using the mean.

You might be surprised at how difficult it is to beat that benchmark with cross-validation and no overfitting or look-ahead bias.
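
A minimal sketch of that benchmark, assuming a one-step-ahead forecast from the mean of the previous n values. The series is made up, and the function names are only for illustration. Each forecast uses only values strictly before the point it predicts, so there is no look-ahead.

```python
def moving_average_forecast(series, n):
    """Forecast each point from the mean of the n values before it."""
    return [sum(series[t - n:t]) / n for t in range(n, len(series))]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


series = [10, 12, 11, 13, 12, 14, 13, 15]  # made-up data
n = 3
forecasts = moving_average_forecast(series, n)
baseline_mae = mean_absolute_error(series[n:], forecasts)
```

Any fancier model would need to come in under `baseline_mae` on the same held-out points before it earns its complexity.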

Thank you for your fabulous response. I hope my provocative comment wasn't in bad humor, disrespectful or trolling. I too love averages and regressions. Thank you for proudly defending these marvelously simple and powerful tools.
Well, actually working with "averages" as baselines before you start experimenting with more complex ML models is a good habit.

Sure, they are dummy regressors [1], but they can be so useful for proving that whatever ML model you choose is at least better than a dummy baseline. If your model can't beat it, then you need to develop a better one.
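
As a rough sketch of that check, here is a hand-rolled mean baseline in plain Python rather than the library class linked in [1]. `MeanBaseline` and `beats_baseline` are made-up names, and the data is invented.

```python
class MeanBaseline:
    """Ignores the inputs and always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def beats_baseline(model_predictions, X_train, y_train, X_test, y_test):
    """True if the candidate's test error is lower than the mean baseline's."""
    baseline = MeanBaseline().fit(X_train, y_train)
    return mse(y_test, model_predictions) < mse(y_test, baseline.predict(X_test))


# Made-up data where y = 2x.
X_train, y_train = [[1], [2], [3], [4]], [2, 4, 6, 8]
X_test, y_test = [[5], [6]], [10, 12]

good_model = [2 * row[0] for row in X_test]  # a candidate that learned y = 2x
bad_model = [0 for _ in X_test]              # a candidate that learned nothing

good_ok = beats_baseline(good_model, X_train, y_train, X_test, y_test)
bad_ok = beats_baseline(bad_model, X_train, y_train, X_test, y_test)
```

Because the baseline has the same fit/predict shape as a real model, it can also serve as the placeholder while the rest of the pipeline is built.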

They can even be used as a place-holder model so you can develop your whole architecture surrounding it, while another teammate is iterating over more complex experiments.

You could also settle for a moving average process as a first model in a time-series [2], because they are easy to implement and simple to reason about.

Never under-estimate the power of an "average".

[1] https://scikit-learn.org/stable/modules/generated/sklearn.du...

[2] https://en.wikipedia.org/wiki/Moving-average_model