by dionidium 1531 days ago
It's notoriously difficult to make region-to-region comparisons using city crime data, so much so that the FBI explicitly warns against it on their crime-reporting website. [0]

The first problem is that crime varies a lot from neighborhood to neighborhood -- the worst neighborhoods for crime are much worse than the best ones. Second, municipal boundaries are mostly historical accidents; cities don't stop at their borders. Some cities draw their borders to include almost their entire surrounding region; others (like San Francisco) include only a tiny portion of their region. Depending on where the crime in the region happens to be, this can make a big difference. Why is Brooklyn part of NYC, but Jersey isn't? Why is Oakland not part of San Francisco, but Queens is part of NYC? Why are all the old inner-ring suburbs in Chicago a part of Chicago-proper, while none of the old inner-ring suburbs of St. Louis are part of St. Louis-proper?

These are just historical accidents that don't matter much if you're a human being walking around, but they have huge impacts on how statistics are compiled in each region.

So, OK, say you solve that problem by only looking at MSAs or "urbanized areas" so that you can normalize the comparison between "cities." That solves the problem, right?

Does it?

You still have the issue that you might be comparing one region with a relatively high rate of crime spread across its entire footprint to another region that has sky-high rates of crime in 5 neighborhoods and is relatively safe everywhere else. Which region has "higher" crime? Can you tell just by looking at its region-wide per-capita crime rates? Is that even a question statistics can answer, or is it philosophy? (See: the old joke about Bill Gates walking into a coffeeshop and drastically raising the average income of everybody inside.)
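To make that concrete, here's a rough sketch with made-up numbers (none of this is real crime data; the neighborhood counts, the equal populations, and the `summarize` helper are all assumptions for illustration). Two hypothetical regions end up with the same region-wide per-capita rate even though the crime is distributed very differently:

```python
from statistics import median

POP = 10_000  # assumption: every neighborhood has the same population

# Made-up crime counts per neighborhood, not real data.
region_a = [50] * 20               # crime spread evenly across 20 neighborhoods
region_b = [170] * 5 + [10] * 15   # crime concentrated in 5 of 20 neighborhoods

def summarize(crimes, pop=POP):
    """Return crimes per 1,000 residents: region-wide, median and worst neighborhood."""
    rates = [1000 * c / pop for c in crimes]
    overall = 1000 * sum(crimes) / (pop * len(crimes))
    return {"overall": overall, "median": median(rates), "worst": max(rates)}

summary_a = summarize(region_a)
summary_b = summarize(region_b)
print(summary_a)  # {'overall': 5.0, 'median': 5.0, 'worst': 5.0}
print(summary_b)  # {'overall': 5.0, 'median': 1.0, 'worst': 17.0}
```

Both regions report 5 crimes per 1,000 residents overall, but the median neighborhood in the second one is far safer and its worst neighborhoods are far more dangerous. The region-wide number alone can't tell you which pattern you're looking at.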

It's a hard problem.

[0] I see now that someone else already posted the link: https://news.ycombinator.com/item?id=30958622

1 comment

Yes, agreed that it's a hard problem, bordering on philosophy. Feeling safe is a unique, individual experience that won't show up in the data.

But in response to the claim from the article that "SF is not safe", I don't know a better way to analyze the claim than crime stats, even though they're imperfect for all the reasons you mentioned.