by LeifCarrotson 2311 days ago
A statistic for everything, but little statistical or scientific literacy: so much data dredging goes on. When someone floats a number like "51 breaking balls with zero missed swings" or "24 straight curveballs" it's never presented with the rate at which this would be expected to occur in the pseudorandom/typical case.

There are close to a million pitches thrown in each season. If someone flipped a coin for every pitch in the 2000s, they would probably get a string of 24 heads and a string of 24 tails. Given the number of pitches that have been thrown, and the human tendency to stick with what's working, the only reason there wouldn't be 24 of one pitch thrown in a row is that pitchers deliberately change it up.


Probably not quite 24 heads (or tails), but close!

I had to dig into this for work, and doing statistics on runs is surprisingly interesting. Suppose you've got a sequence of $n$ events, each of which 'succeeds' with probability $p$. The expected length of the longest run is approximately $\log_{1/p}(n(1-p)) + 0.577/\ln(1/p) - 1/2$. For a fair coin with $p = 0.5$, this reduces to $\log_2(n) - 2/3$, which is about 19 for one million events. Amazingly, the spread depends only weakly on $n$: the standard deviation is about 2 for $p = 0.5$.
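For anyone who wants to poke at the numbers, here is a rough Python sketch of that approximation next to a small Monte Carlo check. The function names, the seed, and the trial counts are my own choices for illustration, not anything from the paper, and the simulation uses a smaller n than one million just to keep pure Python fast.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # the 0.577 constant in the approximation


def expected_longest_run(n, p):
    """Approximate expected longest success run in n trials with success prob p."""
    log_inv_p = math.log(1.0 / p)
    return math.log(n * (1.0 - p)) / log_inv_p + EULER_GAMMA / log_inv_p - 0.5


def longest_run(n, p, rng):
    """Simulate n trials and return the length of the longest success run."""
    best = current = 0
    for _ in range(n):
        if rng.random() < p:
            current += 1
            if current > best:
                best = current
        else:
            current = 0
    return best


def simulated_mean(n, p, trials, seed=0):
    """Average longest run over several simulated sequences (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(longest_run(n, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Formula for one million fair-coin flips.
    print(expected_longest_run(1_000_000, 0.5))
    # Simulation vs. formula on a smaller sequence.
    print(simulated_mean(10_000, 0.5, 200), expected_longest_run(10_000, 0.5))
```

With only a couple hundred trials the simulated mean is noisy, so treat it as a sanity check on the formula rather than a precise estimate.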

Thus, you're probably not going to see a 24-head run in 1M events. I'm excited I got to use this information, as the project I learned it for was a total bust.

More here: Schilling (1990, College Math. J.) https://www.csun.edu/~hcmth031/tlroh.pdf

It's just 1/2^24 for probability 1/2, right?
It’s a different calculation Matt is doing. You are calculating the probability of that run happening now. Matt is calculating the expected length of the longest run in 1 million tosses.
Exactly! (Thanks!)

The distinction is important because a sequence of ten heads seems 'rare' in isolation. However, it is not particularly unusual when you go looking for it as a subsequence of some bigger set of trials.
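To put numbers on that distinction, here is a small Python sketch that computes the chance of seeing at least one run of k successes somewhere in n trials, using a standard first-occurrence recurrence. The function name `prob_run_at_least` is just something I made up for this example, and it assumes independent trials with k >= 1.

```python
def prob_run_at_least(n, k, p=0.5):
    """Probability that n independent trials contain a success run of length >= k."""
    if n < k:
        return 0.0
    q = 1.0 - p
    pk = p ** k
    # no_run[i] = probability that the first i trials contain no run of length k.
    no_run = [1.0] * (n + 1)
    no_run[k] = 1.0 - pk
    for i in range(k + 1, n + 1):
        # The first k-run ends exactly at trial i when: no run in the first
        # i-k-1 trials, then one failure, then k successes in a row.
        no_run[i] = no_run[i - 1] - q * pk * no_run[i - k - 1]
    return 1.0 - no_run[n]


if __name__ == "__main__":
    # Ten heads on the next ten flips vs. somewhere in a thousand flips.
    print(0.5 ** 10, prob_run_at_least(1_000, 10))
    # 24 heads on the next 24 flips vs. somewhere in a million flips.
    print(0.5 ** 24, prob_run_at_least(1_000_000, 24))
```

For the million-flip case this comes out to roughly 3% for a run of 24 heads, which matches the "probably not, but close" conclusion, while the ten-heads case is vastly more likely inside a thousand flips than its 1/1024 chance on any given ten.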

You only need it spelled out if you aren’t very familiar with Major League Baseball. What you are saying is implied.

Kershaw’s curveball is considered one of the best curveballs in baseball.

Getting zero swings and misses is quite a special feat.