by salty_biscuits 2854 days ago
I really violently oppose this characterization of ML as "just" curve fitting, as if curve fitting were some simple solved problem. There seems to be an ignorance of the issues around model selection, which is an essential part of curve fitting. What complexity of model does the data support? Can you keep a distribution over structures that allows uncertain parts of the model to be interrogated? These are the parts of the fitting problem that allow something like "experiments" to be automatically generated as part of the curve fitting.
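
To make the "what complexity of model does the data support?" question concrete, here is a rough sketch of hold-out model selection over polynomial degree. The data, the noise level, the degree range and the name select_degree are all made up for illustration; this is one simple way to do it, not the only one.

```python
import numpy as np

def select_degree(x, y, max_degree=8, holdout_fraction=0.3, seed=0):
    """Pick the polynomial degree with the lowest error on held-out points."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * holdout_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]

    errors = {}
    for degree in range(max_degree + 1):
        # Fit on the training split only.
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        # Score on points the fit never saw.
        residual = np.polyval(coeffs, x[val_idx]) - y[val_idx]
        errors[degree] = float(np.mean(residual ** 2))

    best = min(errors, key=errors.get)
    return best, errors

# Synthetic data (an assumption for this sketch): a noisy cubic.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=60)
y = 2.0 * x**3 - x + rng.normal(0.0, 0.05, size=60)

best_degree, val_errors = select_degree(x, y)
print(best_degree, val_errors[best_degree])
```

Degrees that are too low underfit and score badly on the held-out points; degrees that are too high start chasing noise. The fit at each degree is the easy part, choosing the degree is the part people wave away.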

Not the same kind of experiment. An experiment in the scientific sense tweaks the process that generates the data, not the interpretation of the data. There is an inspiration / hypothesis creation step between old data and new experiment.

Main differences: A hypothesis is sorta kinda like your model's coefficients, but more generally applicable. And you have no feedback loop between model coefficients and input data.

So yeah, you are doing very sophisticated curve fitting. It is useful alright, it's just not very much like science.

No, it's the same. It is just about having access to control variables.
What Chomsky is saying is that the control variables don't exist until you create them because the most telling things don't happen until you have a specific hypothesis and make them happen to test the hypothesis.
I disagree. What he is saying is that there is a special rule for languages that he doesn't think you would get at without an enormous amount of data. So a passive learning algorithm wouldn't uncover this structure in a reasonable amount of time or data (I guess it is poor sample efficiency he is worried about). A learning algorithm that has a distribution over its own internal model of language would be able to ask questions that minimize the uncertainty of the model.
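
A toy sketch of that last point, nothing to do with language: the learner keeps a distribution over candidate models and picks the query whose answer is most uncertain under that distribution. The setup (a hidden integer threshold, a noiseless oracle, the name active_learn_threshold) is assumed purely for illustration.

```python
def active_learn_threshold(oracle, n=128):
    """Learn a hidden threshold t in 0..n-1, where oracle(x) is True iff x >= t."""
    # Uniform prior over the candidate thresholds.
    posterior = [1.0 / n] * n
    queries = 0

    while sum(1 for p in posterior if p > 0) > 1:
        # The probability of a True answer at x is the posterior mass on t <= x.
        # Ask where that probability is closest to 0.5, i.e. where the
        # current model is most uncertain about the answer.
        best_x, best_gap, mass = None, None, 0.0
        for x in range(n):
            mass += posterior[x]
            gap = abs(mass - 0.5)
            if best_gap is None or gap < best_gap:
                best_x, best_gap = x, gap

        label = oracle(best_x)
        queries += 1

        # Condition on the answer: drop thresholds that contradict it.
        for t in range(n):
            if (t <= best_x) != label:
                posterior[t] = 0.0
        total = sum(posterior)
        posterior = [p / total for p in posterior]

    estimate = max(range(n), key=lambda t: posterior[t])
    return estimate, queries

true_threshold = 37  # hidden from the learner, known only to the oracle
estimate, queries = active_learn_threshold(lambda x: x >= true_threshold)
print(estimate, queries)
```

With 128 candidates this needs 7 questions, where a learner that only sees points in whatever order they arrive could need far more. That is the sample efficiency gap in miniature.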
But what you describe is still curve fitting. I say this in spite of some expertise in ML myself. There are some parts of ML that do not fall in the curve fitting family, but they are still a small part, for example Markov logic networks and some parts of reinforcement learning.

What you are saying is that curve fitting with good predictive ability is not trivial, and that is indeed true.

Markov Logic Networks are still about finding coefficients for a probability distribution over some process. My opinion is that there is only curve fitting. There is data and a minimum complexity model that can reproduce the data with minimum error. So do you really believe that there are physical processes where this approach will fail?
There is more to Markov Logic than estimating parameters. That's the "unification" part of the Markov logic, the analogue of https://en.wikipedia.org/wiki/Unification_(computer_science)
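
For anyone who hasn't met the term, here is a minimal sketch of plain syntactic first-order unification as described on that Wikipedia page. The term encoding (variables as strings starting with "?", compound terms as tuples) and the predicate names are my own assumptions, and this is not how any particular Markov logic implementation does it.

```python
def is_var(term):
    # Convention for this sketch: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def walk(term, subst):
    # Follow variable bindings until reaching an unbound variable or a non-variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def occurs(var, term, subst):
    # Occurs check: does var appear inside term under the current bindings?
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False

def unify(a, b, subst=None):
    """Return a substitution that makes a and b identical, or None if none exists."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    # Compound terms: same functor and arity, then unify arguments pairwise.
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

match = unify(("friends", "?x", "bob"), ("friends", "alice", "?y"))
clash = unify(("smokes", "alice"), ("smokes", "bob"))
cyclic = unify("?x", ("parent", "?x"))
print(match, clash, cyclic)
```

Nothing here estimates a coefficient. It is a symbolic search for variable bindings that make two formulas line up, which is the sense in which it sits outside the parameter fitting step.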