Hacker News
by s1artibartfast 1045 days ago
If your model is consistently wrong in a statistically predictable way, either your measurement or model is inaccurate.

A 3% chance that never occurs is an inaccurate prediction.


Right! Yes absolutely!

It's wrong because the measurements are suggestive of possibility, rather than certain of it.

If we observe an asteroid that with two poor measurements is determined to be headed away from Earth, that's the end. Look no further.

If we observe an asteroid with two poor measurements that has some significant chance of hitting, more and better measurements are made. Then very often those better measurements show it was never actually going to hit anyhow.

But we never would have known without the better measurements, and we never would have devoted more time to making better measurements without a reason to do so.

A 3% chance that never occurs comes from that 3% being based on data at the limit of what the telescopes can provide, not from bad math.

Then what does 3% mean? Surely it means "given the data we have, one in every 33 will hit". Since that empirically doesn't happen, it must be that "the data we have" has a very low prior probability of being real. In other words, the measurement noise seems distributed in a way that over-represents unlikely trajectories.
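
To put a rough number on "that empirically doesn't happen", here is a minimal sketch. It assumes each 3% figure is a final, independent forecast, which the replies above dispute, since the published numbers get revised as more data comes in. The function name is made up for illustration.

```python
def prob_no_hits(p: float, n: int) -> float:
    """Chance that none of n independent events, each with probability p, occurs."""
    return (1.0 - p) ** n

# Expected number of hits, and the chance of seeing zero hits,
# for a few hypothetical counts of "3%" forecasts.
for n in (33, 100, 300):
    print(n, round(n * 0.03, 2), prob_no_hits(0.03, n))
```

Under that assumption, zero hits out of 100 such forecasts would happen a bit under 5% of the time, so a long run of misses would be surprising if the numbers were meant as final forecasts.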

Hence it seems that it would lead to more accurate predictions if the measurements and their uncertainties were fitted to a model that corrects for the prior probability of observing an asteroid on a given trajectory/making a certain observation.
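
A sketch of what "correcting for the prior" could look like, using plain Bayes' rule. All the numbers are invented, and the function name is hypothetical. This is not how any real impact-monitoring system is known to work, just the shape of the idea.

```python
def posterior_hit(prior_hit: float, like_hit: float, like_miss: float) -> float:
    """Bayes' rule: P(hit | data) from a prior and the two likelihoods."""
    num = like_hit * prior_hit
    return num / (num + like_miss * (1.0 - prior_hit))

# With an even prior, the likelihoods alone give back 3%.
print(posterior_hit(0.5, 0.03, 0.97))

# A much smaller (made-up) prior on impact trajectories pulls
# the posterior far below 3% for the same data.
print(posterior_hit(0.001, 0.03, 0.97))
```

The whole question then becomes what prior is justified, which is exactly what the thread is arguing about.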

This discrepancy between distribution of measurement error vs distribution of actual trajectories is what people are wondering about, because it seems interesting to know more about (e.g. "why are certain trajectories less likely?").

Despite all the people coming out of the woodwork with weird theories, my best one is that the 3% number doesn't take into account their entire measurement process and sampling.

It's similar to p-hacking.

I don't think you understand how this works at all. You might read up on this here if you want to learn more. https://astronomy.stackexchange.com/questions/8450/how-is-th...

If you just want to argue with people, feel free. But based on how this conversation has been going it doesn't seem like you want to learn.

Setting your condescension aside, I browsed the thread.

I understand that calculating trajectories is difficult.

If someone claims something like a 3% impact probability, and they are wrong 99.999% of the time, that speaks to a methodological error in how the numbers are conveyed and/or defined.

I work in medical devices and testing. I run analyses of the form "X percent of patients will die", based on statistical calculations. You may undergo treatment with a medical device that I have worked on.

Calculating trajectories is easy. Getting good data points is hard. Two pictures using a telescope on back to back nights is probably the smallest reasonable sample one could get. Take another picture the 3rd night and you've just doubled the size of the arc.

Wait a week and get another sample and your arc is now approx 5x as long. Wait a month and get another and now your arc is 30x as long as the original. More observations shrink your error bars.
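
A back-of-the-envelope sketch of why a longer arc shrinks the error bars. It assumes straight-line motion and just two endpoint positions with the same independent error, which is a big simplification (real orbit fits use every observation and a full orbital model). The function name and units are made up.

```python
import math

def rate_uncertainty(sigma_pos: float, arc_days: float) -> float:
    """1-sigma error on a rate estimated from two positions arc_days apart.

    rate = (x2 - x1) / arc_days, and each position has error sigma_pos,
    so the errors add in quadrature and are divided by the arc length.
    """
    return math.sqrt(2.0) * sigma_pos / arc_days

base = rate_uncertainty(1.0, 1.0)  # two back-to-back nights
for arc in (1, 2, 10, 30):
    # Print the arc length and the error relative to the one-day arc.
    print(arc, rate_uncertainty(1.0, arc) / base)
```

In this toy the rate error falls in proportion to the arc length, so the same telescope gives a much tighter answer simply by waiting.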

There are systemic errors here for sure. Two kinds, really:

1. Limits of resolution of telescopes
2. Short sample lengths

You absolutely can't do anything about error type 1. You can fix 2 by getting more data. But there's no point in getting data on asteroids that have absolutely no possibility of hitting. So only asteroids that have some probability with limited measurements get enough better measurements that are high quality in order to find out where they're really headed.
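
A toy simulation of that triage, in one dimension with invented numbers: true miss distances spread evenly over a wide range, one noisy first measurement each, and follow-up only for objects whose naive impact probability clears a threshold. Every name and parameter here is an assumption for illustration, not anyone's actual pipeline.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for the measurement error

def triage(n=50_000, sigma=1.0, radius=0.05, span=50.0, threshold=0.01, seed=0):
    """Return (flagged, hits): objects flagged for follow-up, and how many truly hit."""
    rng = random.Random(seed)
    flagged = hits = 0
    for _ in range(n):
        true_miss = rng.uniform(-span, span)          # true signed miss distance
        measured = true_miss + rng.gauss(0.0, sigma)  # poor first measurement
        # Naive impact probability: error-distribution mass inside the target.
        p = STD.cdf((radius - measured) / sigma) - STD.cdf((-radius - measured) / sigma)
        if p >= threshold:                            # worth a second look
            flagged += 1
            if abs(true_miss) <= radius:              # what perfect follow-up reveals
                hits += 1
    return flagged, hits

flagged, hits = triage()
print(flagged, hits)
```

In this toy most flagged objects turn out to miss once the true value is known, and everything below the threshold is never looked at again.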

All of these measurements of trajectories are completely uncorrelated, so you can't use the priors to adjust probabilities. I mean you can do whatever you want, but we haven't been hit by a big asteroid yet since we've had telescopes and tracking databases.

If we made adjustments based on priors we'd have to discount all collisions down to 0 irrespective of the trajectories. Seems absurd, so there must be something else going on here.

That conclusion may be too early to reach with confidence, based on the limited data!