Hacker News
by _m8fo 3181 days ago
Life is so precious. To think any one of us reading this could be gone tomorrow.

I'm morbidly curious what the odds are that someone reading this will be gone tomorrow. I'll bet it's non-trivial.
I think you are right: the chances aren't extreme, but are non-trivial. This looks to be a chart that has the mortality data necessary to answer your question (although presumably it also includes "non-natural" causes like suicide and homicide): https://www.ssa.gov/oact/STATS/table4c6.html. The first two columns show age, and the probability that a holder of an American Social Security card who is of that age will die within the next year. For all but the oldest ages, I think it should be accurate to simply divide this by 365 to calculate the chance that an individual of that age will die within 24 hours.
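As a rough check on the divide-by-365 shortcut, here is a minimal Python sketch comparing it with the conversion that assumes a constant risk throughout the year. The function names are made up for illustration; 0.0015 is the 30-year-old male figure used further down, and 0.30 is only a stand-in for a very old age, not a value read from the table.

```python
import math

def daily_linear(q_annual, days=365):
    """Shortcut from the comment: spread the annual probability evenly."""
    return q_annual / days

def daily_constant_hazard(q_annual, days=365):
    """Daily probability if the risk is constant all year:
    1 - (1 - q)^(1/days), computed stably for small q."""
    return -math.expm1(math.log1p(-q_annual) / days)

for q in (0.0015, 0.30):  # 0.30 is a hypothetical old-age value
    a, b = daily_linear(q), daily_constant_hazard(q)
    print(f"annual={q}: linear={a:.3e}, constant-hazard={b:.3e}")
```

For small annual probabilities the two agree closely; they drift apart only when the annual probability is large, which fits the "all but the oldest ages" caveat.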

From the chart (and oversimplifying HN demographics by assuming they match those of the chart and consist only of 30-year-old American males), the probability of death within the year is 0.0015, making each individual's daily chance of death about 0.000004, or roughly 1 in 250,000. Unless I'm misremembering, the easiest way to calculate the chance that at least one person in a group of size N dies is to raise the probability of survival (1 - 0.000004 = 0.999996) to the power N, and then subtract the result from 1.
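Written out, and assuming the N readers' deaths are independent events with the same daily probability p (the symbol p is introduced here only for illustration), the calculation looks like this:

```latex
\[
P(\text{at least one death}) \;=\; 1 - (1 - p)^{N} \;\approx\; 1 - e^{-Np},
\qquad p \approx \frac{0.0015}{365} \approx 0.000004 .
\]
```

The exponential form is only an approximation, but it is a close one when p is this small.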

If we guess that we readers are one thousand 30-year-old males, I think that means there is about a 0.4% chance that at least one of us will die before tomorrow (1 - 0.999996^1000 ≈ 0.004). If we instead assume ten thousand 30-year-old males, then we get about a 4% chance that someone won't be around after tomorrow. If we generously assume a hundred thousand such readers, then there is about a 33% chance that at least one of us won't make it another full day. I don't know what the actual readership numbers are for this post (or maybe the grandparent was referring to their own comment rather than the main post?), but it seems likely that it's somewhere within that range.
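For anyone who wants to redo the arithmetic, a minimal Python sketch follows. It uses the rounded daily probability of 0.000004 and the three guessed readership sizes from above; the function name is made up for illustration.

```python
def prob_at_least_one_death(n_readers, p_daily=0.000004):
    """Chance that at least one of n_readers dies within a day,
    assuming independent, identical daily risks."""
    return 1.0 - (1.0 - p_daily) ** n_readers

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7,} readers: {prob_at_least_one_death(n):.2%}")
```

Swapping in a different `p_daily` or reader count is the easiest way to see how sensitive the answer is to either guess.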

If we use a more realistic age distribution for HN, the probability would go up (older readers increase the probability more than younger readers decrease it). On the other hand, if we assume that HN readers on average have better health care and less risk of violence than randomly chosen Social Security card holders, then the probability would go down. But suicide risk would probably go the other way, so I don't know what the total correction factor would be. Still, I'd guess this estimate would remain in the ballpark. Corrections to my methodology or calculations appreciated.
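To illustrate the age-distribution point only, here is a hedged Python sketch for a group with mixed ages. The annual rates for ages 20 and 40 are PLACEHOLDERS chosen to show the shape of the effect, not values taken from the SSA table; only 0.0015 for age 30 comes from the comment above. The function name is likewise hypothetical.

```python
import math

# PLACEHOLDER annual death probabilities; only the age-30 value is from the thread.
ANNUAL_Q = {20: 0.0010, 30: 0.0015, 40: 0.0025}

def prob_any_death(age_counts, annual_q=ANNUAL_Q, days=365):
    """1 - product of everyone's daily survival probability,
    accumulated in log space to avoid rounding trouble."""
    log_all_survive = 0.0
    for age, count in age_counts.items():
        p_daily = annual_q[age] / days
        log_all_survive += count * math.log1p(-p_daily)
    return -math.expm1(log_all_survive)

all_thirty = prob_any_death({30: 10_000})
mixed = prob_any_death({20: 5_000, 40: 5_000})  # same average age of 30
print(f"all 30-year-olds: {all_thirty:.2%}, mixed 20s and 40s: {mixed:.2%}")
```

With rates that rise faster than linearly with age, the mixed group comes out higher than the all-30 group even though the average age is the same. The health-care, violence, and suicide corrections are not modelled here.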

Nice analysis. One shortcoming is that you group all deaths together, expected and unexpected. A considerable share of the deaths among 30-year-olds may be expected, mostly from cancer.
Given that HN's demographic skews heavily towards young males with high-income jobs, the chances of death tomorrow are probably lower than for most general audiences.
The odds that a Hacker News user will die tomorrow approach certainty. The probability that that person has read this particular comment is considerably lower, as not every user reads every comment.

I've been playing with the "actuary.py" program and some generated data using awk to come up with estimates based on some very crude assumptions: that there are 100k users, that the ages are uniformly distributed from 20 to 50, and that they are 80% male.

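    # Build 100,000 "<age><sex>" arguments (e.g. "34m") and pass them to actuary.py.
    time ~/bin/actuary.py $(
        gawk 'BEGIN { srand(); for (i = 1; i <= 100000; i++) {
            age = 20 + int(rand() * 30);          # integer ages 20..49
            sex = (rand() < 0.8) ? "m" : "f";     # roughly 80% male
            printf("%s%s ", age, sex) }
        }')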

    There is a 5%  chance of someone dying within 0.0 years (by 2017).
    There is a 50% chance of someone dying within 0.0 years (by 2017).
    There is a 95% chance of someone dying within 0.01 years (by 2017).

That last number tells you that there's a 95% chance of someone dying within 0.01 years, which is to say, 3.65 days.
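One caveat on that figure: the output looks like it is rounded to two decimal places of a year (an assumption based on the printed values, not on reading the script), so "0.01 years" covers a range of days rather than exactly 3.65. A small Python sketch of that range, with a made-up helper name:

```python
DAYS_PER_YEAR = 365

def days_range_for_rounded_years(years_rounded, decimals=2):
    """Range of days consistent with a value printed after rounding
    to `decimals` places (assumed rounding, see note above)."""
    half_step = 0.5 * 10 ** (-decimals)
    low = max(years_rounded - half_step, 0.0)
    high = years_rounded + half_step
    return low * DAYS_PER_YEAR, high * DAYS_PER_YEAR

low_days, high_days = days_range_for_rounded_years(0.01)
print(f"0.01 years is roughly {low_days:.1f} to {high_days:.1f} days")
```

The same rounding would explain why the 5% and 50% lines both read "0.0 years".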

The script also fills in the likelihood that we will all die (we will) within a given time:

    There is a 5%  chance of everyone dying within 85.66 years (by 2103).
    There is a 50% chance of everyone dying within 87.7 years (by 2105).
    There is a 95% chance of everyone dying within 90.71 years (by 2108).
    Probability of all dying in 1.0 year:   <0.001%
    Probability of a death within 1.0 year: >99.99%

The actual numbers were about 300k daily uniques and about 3m monthly, as of about 3 years ago. My 100k population may be a good estimate of the actual number of humans reading on a given day. The real age distribution almost certainly skews younger overall but extends older, so don't take the values as gospel, only as a general indication.

I also believe the mortality tables Randall uses have been updated since.

The script takes over 6 minutes to run on a rather modest system at 100k individuals.

HN user data (from ~3 years ago):

https://news.ycombinator.com/item?id=9219581

xkcd "actuary.py":

https://blog.xkcd.com/2012/07/12/a-morbid-python-script/comm...