by tedsanders 339 days ago
Interestingly, this is actually a question that's been looked at empirically!

Take a look at this paper: https://scholar.harvard.edu/files/rzeckhauser/files/value_of...

They took high-precision forecasts from a forecasting tournament and rounded them to coarser buckets (nearest 5%, nearest 10%, nearest 33%), to see if the precision was actually conveying any real information. What they found is that if you rounded the forecasts of expert forecasters, Brier scores got consistently worse, suggesting that expert forecast precision at the 5% level is still conveying useful, if noisy, information. They also found that less expert forecasters took less of a hit from rounding their forecasts, which makes sense.
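For anyone who wants to poke at the rounding experiment themselves, here is a minimal sketch in Python. It is not the paper's method or data: the forecasts are synthetic, the noise level (`noise_sd`) is an arbitrary assumption, the "nearest 33%" bucket is approximated by a step of 1/3, and the helper names (`brier_score`, `round_to_step`, `simulate`) are made up for illustration.

```python
import random


def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes (lower is better)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def round_to_step(p, step):
    """Round probability p to the nearest multiple of step, clamped to [0, 1]."""
    return min(1.0, max(0.0, round(p / step) * step))


def simulate(n, noise_sd, seed):
    """Synthetic tournament (an assumption, not real data):
    hidden true probabilities, noisy forecasts of them, and sampled outcomes."""
    rng = random.Random(seed)
    forecasts, outcomes = [], []
    for _ in range(n):
        p = rng.random()  # hidden true probability of the event
        f = min(1.0, max(0.0, p + rng.gauss(0.0, noise_sd)))  # noisy forecast
        forecasts.append(f)
        outcomes.append(1 if rng.random() < p else 0)
    return forecasts, outcomes


if __name__ == "__main__":
    forecasts, outcomes = simulate(n=200_000, noise_sd=0.05, seed=0)
    print("unrounded:", brier_score(forecasts, outcomes))
    for step in (0.05, 0.10, 1 / 3):
        rounded = [round_to_step(f, step) for f in forecasts]
        print(f"step {step:.2f}:", brier_score(rounded, outcomes))
```

Raising `noise_sd` is a rough stand-in for a less expert forecaster: the noisier the forecasts already are, the smaller the share of the total error that rounding contributes.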

It's a really interesting paper, and they recommend that foreign policy analysts try to increase precision rather than retreating to lumpy buckets like "likely" or "unlikely".

Based on this, it seems totally reasonable for a rationalist to make guesses with single digit precision, and I don't think it's really worth criticizing.


Likely vs. unlikely is rounding to 50%. Single digit is rounding to 1%. I don't think the parent was suggesting the former is better than the latter. Even before I read your comment I thought that 5% precision is useful but 1% precision is a silly turn-off, unless that 1% is near the 0% or 100% boundary.
The book Superforecasting documented that for their best forecasters, rounding off that last percent would reliably worsen Brier scores.

Whether rationalists who are publicly commenting actually achieve that level of reliability is an open question. But that humans can be reliable enough in the real world that the last percentage matters, has been demonstrated.

Your comment is incredibly confusing (possibly misleading) because of the key details you've omitted.

> The book Superforecasting documented that for their best forecasters, rounding off that last percent would reliably worsen Brier scores.

Rounding off that last percent... to what, exactly? Are you excluding the exceptions I mentioned (i.e., when you're already close to 0% or 100%)?

Nobody is arguing that 3% -> 4% is insignificant. The argument is over whether 16% -> 15% is significant.

To the nearest 5%, for percentages in that middle range. It is not just 16% -> 15%. But also 46% -> 45%.
Yes so this confirms my point rather than refuting it...
It seems that you reversed your point then. You said before:

> Even before I read your comment I thought that 5% precision is useful but 1% precision is a silly turn-off, unless that 1% is near the 0% or 100% boundary.

However what I am saying is that there is real data, involving real predictions, by real people, that demonstrates that there is a measurable statistical loss of accuracy in their predictions if you round off those percentages.

This doesn't mean that any individual prediction is accurate to that percent. But it happens often enough that the last percent really does contain real value.

The most useful frame here is looking at log odds. Going from 15% -> 16% means

-log_2(.15/(1-.15)) -> -log_2(.16/(1-.16))

=

2.5 -> 2.39

So saying 16% instead of 15% implies an additional tenth of a bit of evidence in favor (alternatively, 16/15 ~= 1.07 ~= 2^.1).

I don't know if I can weigh in on whether humans should drop a tenth of a bit of evidence to make their conclusion seem less confident. In software (e.g., a spam detector), dropping that much information to make the conclusion more presentable would probably be a mistake.
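The log-odds arithmetic above is easy to check. A small Python sketch, with function names (`log_odds_bits`, `evidence_bits`) that are my own and not from any library:

```python
import math


def log_odds_bits(p):
    """Log odds of probability p, in bits (base-2 logarithm of p / (1 - p))."""
    return math.log2(p / (1 - p))


def evidence_bits(p_from, p_to):
    """Bits of evidence needed to move from probability p_from to p_to."""
    return log_odds_bits(p_to) - log_odds_bits(p_from)


if __name__ == "__main__":
    # Negated to match the sign convention used in the comment above.
    print(round(-log_odds_bits(0.15), 2), "->", round(-log_odds_bits(0.16), 2))
    print("15% -> 16%:", round(evidence_bits(0.15, 0.16), 2), "bits")
    print("3% -> 4%:", round(evidence_bits(0.03, 0.04), 2), "bits")
```

The same one-point step costs about 0.11 bits in the middle of the range (15% -> 16%) but about 0.43 bits near the boundary (3% -> 4%), which is one way to see why the thread treats the two cases differently.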

I thought single digit means single significant digit, aka rounding to 10%?
I did mean 1%, not sure if I used the right term though, English not being my first language.
Wasn't 16% the example they were talking about? Isn't that two significant digits?

And 16% very much feels ridiculous to a reader when they could've just said 15%.

In context, the "at least 16%" is responding to someone who said 8%, and 16 just happens to be exactly twice 8. I suspect (though I don't know) that Yudkowsky would not have claimed to have a robust way to pick whether 16% or 17% was the better figure.

For what it's worth, I don't think there's anything even slightly wrong with using whatever estimate feels good to you, even if it happens not to fit someone else's criterion for being a nice round number, even if your way of getting the estimate was sticking a finger in the air and saying the first number you thought of. You never make anything more accurate by rounding it[1], and while it's important to keep track of how precise your estimates are I think it's a mistake to try to do that by modifying the numbers. If you have two pieces of information (your best estimate, and how fuzzy it is), you should represent it as two pieces of information[2].

[1] This isn't strictly true, but it's near enough.

[2] Cf. "Pitman's two-bit rule".

> In context, the "at least 16%" is responding to someone who said 8%, and 16 just happens to be exactly twice 8. I suspect (though I don't know) that Yudkowsky would not have claimed to have a robust way to pick whether 16% or 17% was the better figure.

If this was just a way to say "at least double that", that's... fair enough, I guess.

Regarding your other point:

> For what it's worth, I don't think there's anything even slightly wrong with using whatever estimate feels good to you, even if it happens not to fit someone else's criterion for being a nice round number

This is completely missing the point. There absolutely is something wrong with doing this (barring cases like the above where it was just a confusing phrasing of something with less precision like "double that"). The issue has nothing to do with being "nice", it has to do with the significant figures and the error bars.

If you say 20% then it is understood that your error margin is 5%. Even those that don't understand sigfigs still understand that your error margin is < 10%.

If you say 19% then suddenly the understanding becomes that your error margin is < 1%. Nobody is going to see that and assume your error bars on it are 5% -- nobody. Which is what makes it a ridiculous estimate. This has nothing to do with being "nice and round" and everything to do with conveying appropriate confidence.

I'm not missing the point, I'm disagreeing with it. I am saying that the convention that if you say 20% then you are assumed to have an error margin of 5%, while if you say 19% you are assumed to have an error margin of 1%, is a bad convention. It gives you no way to say that the number is 20% with a margin of 1%. It gives you only a very small set of possible degrees-of-uncertainty. It gives you no way to express that actually your best estimate is somewhat below 20% even though you aren't sure it isn't 5% out.

It's true, of course, that if you are talking to people who are going to interpret "20%" as "anywhere between 17.5% and 22.5%" and "19%" as "anywhere between 18.5% and 19.5%", then you should try to avoid giving not-round numbers when your uncertainty is high. And that many people do interpret things that way, because although I think the convention is a bad one it's certainly a common one.
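To make that reading convention concrete, here is a rough Python sketch. The rule in `assumed_step` (multiples of 5 are read as "rounded to the nearest 5", anything else as "rounded to the nearest 1") is my simplification of how such readers behave, and both function names are hypothetical.

```python
def assumed_step(percent):
    """Granularity a reader is assumed to infer from a whole-number percentage:
    multiples of 5 read as 'nearest 5', everything else as 'nearest 1'."""
    return 5 if percent % 5 == 0 else 1


def implied_interval(percent, step):
    """Range of values that would round to `percent` at the given step."""
    half = step / 2
    return (percent - half, percent + half)


if __name__ == "__main__":
    for stated in (20, 19):
        step = assumed_step(stated)
        print(stated, "->", implied_interval(stated, step))
```

Passing `step` explicitly, as in `implied_interval(20, 1)`, is the case the convention cannot express on its own: a round number with a narrow margin. That is the argument for carrying the estimate and its margin as two separate values.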

But: that isn't what happened in the case you're complaining about. It was a discussion on Less Wrong, where all the internet-rationalists hang out, and where there is not a convention that giving a not-round number implies high confidence and high precision. Also, I looked up what Yudkowsky actually wrote, and it makes it perfectly clear (explicitly, rather than via convention) that his level of uncertainty was high:

"Ha! Okay then. My probability is at least 16%, though I'd have to think more and Look into Things, and maybe ask for such sad little metrics as are available before I was confident saying how much more."

(Incidentally, in case anyone's similarly salty about the 8% figure that gives context to this one: it wasn't any individual's estimate, it was a Metaculus prediction, and it seems pretty obvious to me that it is not an improvement to report a Metaculus prediction of 8% as "a little under 10%" or whatever.)

My interpretation was that Yudkowsky simply doubled Christiano's guess of 8% (as one might say in conversation "oh it's at least double that", but using the actual number).
Aim small, miss small?