by leereeves 3121 days ago
> If the reporting jurisdictions are missing from the dataset not because there were truly no killings in those areas during this period, but instead because they chose not to report homicides by police

That's a rather large assumption that needs to be justified.


It is justified, both in the citation and the actual text. The ARD dataset is driven by self-reporting from various local agencies. The fact that the FBI's SHR data and the ARD dataset have a mismatch (that is, there are police-related homicides in SHR that are not present in the ARD data and vice versa) is proof enough that there is underreporting in these datasets!
I just realized that underreporting is already accounted for in the 1,250/year estimate, thanks to the statistical analysis described in the article.

A = the number of jurisdiction-reported homicides

B = the number of media-reported homicides

M = the number of homicides on both lists

N = AB / M

Now, if jurisdiction-reported homicides are underreported by a factor of X, we can derive a more accurate figure for A by multiplying A by X. We also multiply M by X, because adding cases to list A adds matches between the two lists in the same ratio (on average), so the estimate doesn't change.

N = (XA * B) / (XM) = AB / M

This assumes, of course, that homicides in the jurisdictions that don't report to the FBI or BJS are still reported by the media. That may not be true, but if it isn't, that needs to be shown.
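
For anyone who wants to check the cancellation numerically, here is a minimal Python sketch of the N = AB / M estimate. The counts and the factor X are made up purely for illustration (they are not figures from the article), and the function name is my own. It also bakes in the assumption stated above: the matches M grow by the same factor X as list A.

```python
from fractions import Fraction

def two_list_estimate(a, b, m):
    """Estimate the total N = A*B / M from two overlapping lists."""
    if m == 0:
        raise ValueError("no overlap between the lists; estimate is undefined")
    # Fractions keep the arithmetic exact, so the comparison below is not
    # affected by floating-point rounding.
    return Fraction(a) * Fraction(b) / Fraction(m)

# Hypothetical counts, NOT taken from the article.
A, B, M = 300, 400, 100
X = Fraction(3, 2)  # hypothetical underreporting factor for list A

baseline = two_list_estimate(A, B, M)
# Assumption: correcting A by X also scales the matches M by X.
scaled = two_list_estimate(X * A, B, X * M)

print(baseline, scaled)  # 1200 1200
```

If the matches did not scale with A (for example, if the media also missed the killings in non-reporting jurisdictions), the second call would use a smaller M and the estimate would go up, which is exactly the caveat above.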