by stagbeetle 3000 days ago
Here is the issue with the OG:

1). He has no sources. [0]

2). Sample size for Uber deaths is 1

3). Miles driven is also sampled at 1

2+3). With only one pair, it is a rate, not "rates" of death by Uber. There has been 1 death by Uber, as opposed to the many deaths that motor vehicle accidents have caused.

4). Per mile is an arbitrary metric, and in this case a false equivalence due to n=1. One Uber car killed one person, as opposed to millions of cars killing thousands of people. You cannot compare the cumulative results of the many to the single result of the one.

4.1). Per mile is an arbitrary metric. It tells us nothing but how many miles of road must be driven by every single driver until 1 death will happen. How do we measure total miles driven in a practical fashion? We can't. We estimate months later.

4.2). Per mile is an arbitrary metric. It doesn't let us know how quickly deaths happen. Do they happen every 1,000 hours? Every 10 hours? Every 100 years?

6). Comparison is unstandardized. 1 kill per ~100m miles is an aggregate and 1 kill per ~3m miles is an absolute. To normalize the data, you would take all the drivers who killed people, their total aggregate miles driven, and graph them on a standard distribution. Plop Uber in there to get your real likelihood of an Uber killing you compared to a regular human.

I might even do #6 if I can find the data.

[0]https://www-fars.nhtsa.dot.gov/Main/index.aspx

1 comment

Okay... let’s try it this way.

Assume we know a) the total number of miles driven in a year by sober humans and b) the number of traffic fatalities caused by sober humans over a year. (The year part isn’t actually important for this; it could be for all time as far as this is concerned. What is important is the miles.) Your observations are the miles driven. Time doesn’t play into this, but you could do the same calculation by hours driven or number of trips if you have that data. Hours driven and miles driven are going to be pretty well correlated, so let’s just use miles.

Let’s say this rate is 1 fatality every 100 million miles.

Your question is now: given that we’ve observed one fatality in 3 million miles driven for Uber, is the rate for Uber worse than the rate for humans? (The null hypothesis is that the rates are the same.) Another way of saying this: given the rate of one fatality per 100m miles, what is the likelihood that we’d see one fatality in 3m miles?

If you want to estimate the number of fatalities that will happen over the next X million miles driven, you’d use the Poisson distribution, because this is a rare event over a long time span (or mile-span in this case). Plug in the rate and the number of miles, and you get a probability for each fatality count (none, 1 fatality, 2 fatalities, etc.).
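
A rough sketch of that calculation, assuming the round figures used in this thread (1 fatality per 100 million miles for humans, 3 million miles driven). The helper name `poisson_pmf` and the constants are just for illustration:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when lam events are expected."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Assumed round numbers from the thread, not measured data.
HUMAN_RATE = 1 / 100e6   # fatalities per mile
MILES = 3e6              # miles driven

# Expected fatalities over that distance if the human rate applies.
expected = HUMAN_RATE * MILES  # about 0.03

# Probability of seeing exactly 0, 1, 2, 3 fatalities.
probs = {k: poisson_pmf(k, expected) for k in range(4)}
for k, p in probs.items():
    print(f"P({k} fatalities) = {p:.6f}")
```

Under these assumptions nearly all of the probability sits on zero fatalities, which is why even a single one stands out.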

Given this rate (1/100m), you can also calculate the likelihood that there would be 1 fatality in 3 million miles. Turns out, it’s not that likely, which suggests that the fatality rate for Uber is higher than for humans [0]. It doesn’t say what the rate is exactly, just that it is likely to be higher. It’s still possible that the Uber rate is the same as for humans, just not all that likely.
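
To put a number on "not that likely", here is a minimal sketch of the tail probability under the null hypothesis, again assuming the thread's round figures (1 per 100m miles, 3m miles). The function name `prob_at_least_one` is made up for this example:

```python
import math

def prob_at_least_one(rate_per_mile: float, miles: float) -> float:
    """P(at least one event) for a Poisson process with the given rate."""
    expected = rate_per_mile * miles
    # 1 - exp(-expected), computed with expm1 for accuracy at small values.
    return -math.expm1(-expected)

# Assumed inputs: human rate of 1 fatality per 100m miles, 3m miles driven.
p_value = prob_at_least_one(1 / 100e6, 3e6)
print(f"P(at least 1 fatality in 3m miles) = {p_value:.4f}")
```

With these inputs the answer comes out at roughly 3%, so one fatality in 3m miles would be unusual if the true rate matched the human one.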

[0] https://news.ycombinator.com/item?id=16684764