by Dylan16807 2056 days ago
It means that you're not just adding up some simple curves, you're taking your variable to extremely high powers, in this case all the way up to x^12. The more orders/powers you add to a fit equation, the closer it will artificially get to the data inside your data window (as you sledgehammer it into a nearly arbitrary shape). And the faster it will shoot off toward infinity the moment it leaves your data window, because those extreme powers of your variable are all fighting each other and have no real connection to the underlying data; what physical process is causing x^10 and x^11 and x^12 curves or even x^5 and x^6?

See the last example here: https://xkcd.com/2048/
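To make the point above concrete, here is a rough sketch (not the paper's data or method). It fits polynomials of degree 1, 3 and 12 to a made-up noisy straight line sampled on [-1, 1], then compares the residual inside the window with the error at x = 2, outside it. The line, the noise level, the seed and the name `fit_report` are all assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Made-up data: a straight line plus noise, sampled only inside [-1, 1].
def true_model(x):
    return 2.0 * x + 1.0

x = np.linspace(-1.0, 1.0, 30)
y = true_model(x) + rng.normal(scale=0.1, size=x.size)

def fit_report(degree, x_outside=2.0):
    """Return (RMS residual inside the window, abs error at x_outside)."""
    p = Polynomial.fit(x, y, degree)  # least-squares fit, degree + 1 coefficients
    rms_inside = float(np.sqrt(np.mean((p(x) - y) ** 2)))
    err_outside = float(abs(p(x_outside) - true_model(x_outside)))
    return rms_inside, err_outside

results = {degree: fit_report(degree) for degree in (1, 3, 12)}
for degree, (rms, err) in results.items():
    print(f"degree {degree:2d}: RMS inside = {rms:.3f}, error at x = 2 = {err:.3g}")
```

The in-window residual can only go down as the degree goes up, since the lower-degree fits are special cases of the higher-degree one. The error outside the window is where the extra powers show up.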


> or even x^5 and x^6?

The highest I know of is Lighthill's (aptly named) eighth power law, which says that the sound power created by a turbulent flow scales with the eighth power of the characteristic turbulent velocity: https://en.wikipedia.org/wiki/Lighthill%27s_eighth_power_law. Does anyone know of something higher?

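Another is a simplified model of the interaction between atoms, the Lennard-Jones potential https://en.wikipedia.org/wiki/Lennard-Jones_potential, which has the form V = a/d^12 - b/d^6 (the corresponding force goes as 1/d^13 and 1/d^7).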

The difference is that in my example and in your example there are only a few coefficients to tweak to fit the data. So the shape of the curve is fixed, and it is very difficult to overfit the data.

In the paper they used a full polynomial of degree 12, which has 13 coefficients to tweak, so it is very easy to get weird shapes.
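As a small illustration of the "few coefficients, fixed shape" case, the sketch below fits the two-coefficient Lennard-Jones shape a/d^12 - b/d^6 to made-up noisy samples. The true values a = 1, b = 2, the sampling range and the noise level are assumptions chosen only for the example. Because the model is linear in a and b, plain least squares is enough.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up "true" coefficients and noisy samples of a/d^12 - b/d^6.
a_true, b_true = 1.0, 2.0
d = np.linspace(0.95, 2.0, 40)
y = a_true / d**12 - b_true / d**6 + rng.normal(scale=0.005, size=d.size)

# The model is linear in (a, b): y ~ a * d^-12 + b * (-d^-6).
design = np.column_stack([1.0 / d**12, -1.0 / d**6])
(a_fit, b_fit), *_ = np.linalg.lstsq(design, y, rcond=None)

print(f"a = {a_fit:.3f}, b = {b_fit:.3f}  (2 coefficients)")
print("a full degree-12 polynomial would have", 12 + 1, "coefficients")
```

With only two knobs, the fit can rescale the two terms but cannot bend the curve into a new shape, which is why it is hard to overfit.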

Nice example! But yes, I agree. My field is certainly not astronomy but... that method to remove noise seems extremely weird. XKCD-level joke weird. Even in the rebuttal arxiv paper cited here where they use a 3rd degree poly to remove the noise it seems... not to fit very well? Seems strange to use a random fit without any guesstimate of the underlying cause/model of the noise.
Sorry for the very late response...

> they use a 3rd degree poly to remove the noise it seems... not to fit very well

They are not trying to fit the noise, they are trying to fit the hidden smooth signal that is hidden by the noise. In some cases it is difficult to make a formula for the real signal, so you can approximate it locally with a polynomial.

The idea is that after you subtract the smooth part, you get only the noise. So you can calculate the expected noise level.

And if the "noise" has a big peak, you can guess there is something strange, like a big absorption line.

See also my other comment: https://news.ycombinator.com/item?id=24985680
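Here is a minimal sketch of the subtract-the-smooth-part idea described above, on a made-up spectrum. It is not the procedure from either paper. The continuum shape, the noise level, the dip at x = 0.6, the degree-3 fit and the 5 sigma threshold are all assumptions for illustration, and the noise estimate uses the median absolute deviation as one possible robust choice.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2)

# Made-up spectrum: smooth continuum + Gaussian noise + one narrow absorption dip.
x = np.linspace(0.0, 1.0, 201)
continuum = 1.0 + 0.5 * x - 0.3 * x**2
noise_sigma = 0.02
line_center = 0.6
dip = 0.3 * np.exp(-0.5 * ((x - line_center) / 0.01) ** 2)
y = continuum - dip + rng.normal(scale=noise_sigma, size=x.size)

# 1. Approximate the hidden smooth signal with a low-degree polynomial.
smooth = Polynomial.fit(x, y, 3)

# 2. Subtract it; what is left should be mostly noise.
residual = y - smooth(x)

# 3. Estimate the noise level with the median absolute deviation,
#    so the dip itself barely affects the estimate.
mad = np.median(np.abs(residual - np.median(residual)))
sigma_est = 1.4826 * mad

# 4. Flag points far below the noise level as candidate absorption features.
flagged = np.flatnonzero(residual < -5.0 * sigma_est)
print(f"estimated noise sigma: {sigma_est:.4f} (true value {noise_sigma})")
print("flagged x positions:", np.round(x[flagged], 3))
```

The low degree is the point: a degree-3 polynomial is too stiff to follow a narrow dip, so the dip survives in the residual instead of being absorbed into the fit.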