by devereaux 2711 days ago
Yes, I do. In general, I stick to simple things. Enough samples and everything is normal, god bless the law of large numbers.

If the precision of your estimation is not a direct function of the standard deviation, but is a "hidden property of the process that obtained" it, we have much bigger problems than losing "valuable information".

> If the precision of your estimation is not a direct function of the standard deviation, but is a "hidden property of the process that obtained" it

I think you're confusing different types of error. There is error between measurements and an inherent error to the device you use to measure. There's also a difference between precision and accuracy.

Standard deviation measures the spread among repeated measurements. For example, if you measure something 10 times and get 51mm every time, then your standard deviation is 0.

But that doesn't mean you have no error.

The "property of the process that obtained it" is not hidden. A simple case is a ruler. You have lines on the ruler that tell you certain intervals. If the smallest interval on your ruler is 1mm, then all your calculations can be made to +/- 1mm (that is, up to 30.5cm on a standard 12in ruler). There is nothing hidden about this. All that is being said here is that your measuring device is not perfect.

So using the two errors, we have a measurement of 51mm +/- 1mm (or frequently, as shorthand, you'd just say 51mm). It would in fact be deceptive to say that your measurement was 51.0mm, because that implies that you have more precision than you actually have (implying that you have on the order of +/- 0.1mm precision).
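
As a rough illustration of combining the two kinds of error described above, here is a minimal Python sketch. The function name `combined_uncertainty` and the choice to add the two terms in quadrature are assumptions made for this example; a simpler convention is to report whichever of the two is larger.

```python
import math
import statistics


def combined_uncertainty(readings, resolution):
    """Combine reading-to-reading spread with instrument resolution.

    Assumption for this sketch: the two contributions are independent
    and are added in quadrature.
    """
    spread = statistics.stdev(readings)  # sample standard deviation
    return math.hypot(spread, resolution)


readings_mm = [51] * 10  # ten identical readings from a ruler with 1mm ticks
print(statistics.stdev(readings_mm))           # spread between readings
print(combined_uncertainty(readings_mm, 1.0))  # spread plus resolution
```

With ten identical readings the spread is zero, so the reported uncertainty comes entirely from the ruler's 1mm resolution.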

Assuming we are talking about software, error between measurements is a direct function of the device you use to measure, which is itself close to perfect.

Even if we go with the example you give, the measurement should be done n times, each reporting the exact result found, like 51.0, 51.9, 51.95, etc., even if the decimals are outside the smallest interval of your ruler: take enough of them and you can get closer to the actual length which may be 51.55345 and that you would never have been able to measure anyway without a caliper.
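
A small simulation can make this claim concrete. The sketch below is an idealized model, not a statement about any real ruler: it assumes each reading is the true length plus zero-mean Gaussian noise, rounded to the nearest 1mm tick. The function name, the true length of 51.55mm, and the noise levels are all invented for illustration.

```python
import random
import statistics


def mean_of_ruler_readings(true_length, noise_sd, n, seed=0):
    """Average n readings, each rounded to the nearest 1mm tick.

    Assumption for this sketch: each reading is the true length plus
    zero-mean Gaussian noise, then rounded to the nearest whole mm.
    """
    rng = random.Random(seed)
    readings = [round(true_length + rng.gauss(0, noise_sd)) for _ in range(n)]
    return statistics.mean(readings)


TRUE_LENGTH_MM = 51.55  # hypothetical value, not taken from a real measurement

# No reading-to-reading variation: every reading rounds to the same tick.
print(mean_of_ruler_readings(TRUE_LENGTH_MM, noise_sd=0.0, n=10_000))

# Random variation of about half a tick: readings land on neighboring ticks.
print(mean_of_ruler_readings(TRUE_LENGTH_MM, noise_sd=0.5, n=10_000))
```

Under these assumptions, the noiseless average stays at 52 no matter how many readings are taken, while the noisy average moves toward 51.55. Whether real readings behave like the second case depends on the errors being random and unbiased, which the simulation simply assumes.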

The best thing is you can even do that by resampling old measurements (a process called bootstrapping).
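
For readers unfamiliar with bootstrapping, here is a minimal sketch of the idea using only the Python standard library. The readings are made up for illustration, and `bootstrap_standard_error` is a hypothetical helper name, not a library function.

```python
import random
import statistics


def bootstrap_standard_error(readings, n_resamples=2000, seed=0):
    """Estimate the standard error of the mean by resampling with replacement."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Draw a new sample of the same size from the existing readings.
        resample = rng.choices(readings, k=len(readings))
        means.append(statistics.mean(resample))
    return statistics.stdev(means)


# Hypothetical readings in mm, invented for illustration.
readings_mm = [51.0, 51.9, 52.0, 51.5, 52.0, 51.0, 52.0, 51.6, 51.9, 51.4]
print(statistics.mean(readings_mm))
print(bootstrap_standard_error(readings_mm))
```

Resampling reuses the same readings, so it estimates how much the mean would vary from sample to sample; it does not add new measurements.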

So yes, if you remove the tenth of millimeters, you lose information.

What's wrong is not the number, but that custom makes people think 51.0 means 51.0 +- 0.01 or anything else while it was never said like this.

Even ignoring the part where you are measuring past the precision of the measuring tool, you're still wrong. Making multiple measurements and then averaging them does not give you more precision just because you get a non-integer (in this example) number. It does not bestow you with more precision.

The reason you take multiple measurements is that, as humans, we are imprecise. Any engineer, woodworker, whatever knows the saying "measure twice, cut once". In your example, the thing you are measuring could well be exactly 52.0000...mm.

If you don't believe me I seriously want you to ask ANY physicist. They can even be an undergraduate (assuming they aren't a freshman) and they should know this. We even use this to figure out what we should spend money on. We can process these errors and determine which measuring device needs more certainty and buy that new device.

This is WHY we have very precise devices. With your method, we could theoretically get measurements to nanometer levels. I can tell you, I would much rather spend 15 minutes making a hundred measurements than spending thousands of dollars on a laser and equipment needed to make precise measurements down to the nm level.

To sum up:

> take enough of them and you can get closer to the actual length which may be 51.55345

NO! This is just dead wrong. It'd be right if you said 51 or 52.

> that you would never have been able to measure anyway without a caliper

No! This is why we have calipers!

> if you remove the tenth of millimeters, you lose information.

Not if you didn't have that information in the first place.

>What's wrong is not the number

Yes it is.

I can only assume: 1) You are trolling, 2) You are really thick headed, and/or 3) You've never taken a physics class. I'm not saying anything here that isn't easily verifiable. I have others backing me up. So if you have no interest in learning, then there is no point for me to continue.

How are you getting a measurement of 51.95mm from a ruler with 1mm ticks? If your ruler is capable of outputting that, and if those digits are actually meaningful (even if they do have error bars) then obviously yes they are definitionally significant digits and you shouldn't round them away.

Good point. Maybe not 51.95, but you can say 51.9 mm if you eyeball the measure to be quite close to the mm mark, but not exactly there.

Repeat enough times and you can interpolate 51.9, which you wouldn't have been able to do if you had thrown away the precise measurement, even if the precision is within your measurement error (of 1 mm here).

Your smallest tick is 1mm, not 0.1mm. You can't measure to 51.9mm. So your measurement is 52mm.

Back in high school physics, we would lose points if we indicated too high a precision in the numbers we used for calculations. It was considered plain wrong to say 2.232cm if you were actually only able to measure that it's roughly 2cm.

And that's good, because you'll get the wrong answer if you use a number with too many digits (nitpicking: more decimal places does not mean higher precision).

In fact, this is part of why you'll see physicists do all their reductions with variables and plug in numbers at the very end. This ensures that you don't get (what we could call) floating point errors. You don't have extra numbers hanging around (from real numbers like 1/9 or pi). There are also other benefits to doing this.
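
A loose Python analogy for plugging numbers in only at the end: the sketch below compares rounding an intermediate value with carrying the full value through and rounding once. The function names and the choice of two decimal places are assumptions made for the example.

```python
def round_early():
    """Round an intermediate value to 2 decimals, then keep calculating."""
    ninth = round(1 / 9, 2)  # intermediate result truncated to 0.11
    return round(ninth * 9, 2)


def round_late():
    """Carry the full value through and round once at the end."""
    return round((1 / 9) * 9, 2)


print(round_early())
print(round_late())
```

Rounding early leaves the result short of 1, while rounding only at the end recovers it. Working symbolically goes one step further, since the 1/9 and the 9 cancel before any number is written down.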