Hacker News
by nerdponx 656 days ago
This is pretty much the core principle underlying modern machine learning. More parameters mean a more faithful fit to the data, at the cost of overfitting and generalizing poorly to unseen data from outside the range that was used to tune the parameters. In this particular application, we aren't that worried about overfitting because we know the actual function used to compress the data in the first place, so we know that our decompression function is "correct" and we know the range of the data. So we can keep adding parameters to reduce reconstruction error. Meanwhile in applied ML and stats, cubic and even quadratic models should be used and interpreted only with extreme caution and detailed knowledge of the data (how it was prepared, what the variables mean, what future data might look like, etc).
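To make the "keep adding parameters" point concrete, here is a rough sketch rather than anything from the original table. It assumes the underlying function is sin and the data range is 0 to pi/8, as mentioned elsewhere in the thread. The Chebyshev node placement and the helper names (chebyshev_nodes, lagrange_eval, max_error) are my own choices for illustration.

```python
import math

A, B = 0.0, math.pi / 8  # assumed range of the data

def chebyshev_nodes(n, a=A, b=B):
    """Return n sample points in [a, b], clustered toward the ends."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n))
            for k in range(n)]

def lagrange_eval(xs, ys, x):
    """Evaluate the degree-(n-1) polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def max_error(n, samples=200):
    """Largest |polynomial - sin| on a grid over [A, B], using n parameters."""
    xs = chebyshev_nodes(n)
    ys = [math.sin(x) for x in xs]
    grid = [A + (B - A) * i / samples for i in range(samples + 1)]
    return max(abs(lagrange_eval(xs, ys, x) - math.sin(x)) for x in grid)

# Reconstruction error inside the known range, for 2 to 5 parameters.
errors = {n: max_error(n) for n in (2, 3, 4, 5)}
for n, err in errors.items():
    print(f"{n} parameters: max error {err:.2e}")
```

Each extra parameter shrinks the worst-case error inside the range. That only works here because both the function and the range are known in advance.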

This also seems to be a difference between interpolation and extrapolation. The table doesn't just fit a polynomial to theta between 0 and pi/8 and expect you to extrapolate for theta > pi/8. That would have catastrophic results. It has always seemed to me like one of the big problems with ML is knowing whether a given inference is an interpolation or an extrapolation.
In that sense, extrapolation should never be used in “production”. At best it is for exploration.
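A quick sketch of how badly that goes, assuming the function is sin and the fit is a cubic through four points in 0 to pi/8. The node placement and the names (fit_cubic_to_sin, abs_error) are hypothetical, picked only for this illustration.

```python
import math

A, B = 0.0, math.pi / 8  # the range the polynomial was tuned on

def fit_cubic_to_sin():
    """Return a cubic that matches sin at 4 Chebyshev nodes in [A, B]."""
    n = 4
    mid, half = 0.5 * (A + B), 0.5 * (B - A)
    xs = [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n))
          for k in range(n)]
    ys = [math.sin(x) for x in xs]

    def p(x):
        # Lagrange form of the interpolating polynomial.
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return p

p = fit_cubic_to_sin()

def abs_error(x):
    return abs(p(x) - math.sin(x))

# Interpolation: worst error on a grid inside the fitted range.
inside_err = max(abs_error(A + (B - A) * i / 100) for i in range(101))
# Extrapolation: the same polynomial evaluated far outside that range.
outside_err = abs_error(2 * math.pi)

print(f"inside  [0, pi/8]: {inside_err:.2e}")
print(f"outside at 2*pi:   {outside_err:.2e}")
```

The same four coefficients that are nearly exact inside the range give an answer at 2*pi that is off by far more than the entire range of sin. Nothing in the polynomial itself tells you which regime you are in.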

One characteristic of ML is that this distinction is often not clear. (Hallucination, generalization, etc.)

How I make sense of it: interpolation would require deriving from the resultant, and extrapolation would guarantee no inference.