by zlurker 935 days ago
Interesting! What sort of errors did you find, and were there any takeaways you had other than a lack of care by the data provider?

Each crime report would specify (among other things) what crime was reported and on which city block it had occurred. Using my tool, it was easy to analyze specific crimes over time for specific areas. One of the problems was that their database would have multiple spellings for the same location, which made the analysis much harder since you had to account for every variant.
This is a common problem, especially for "crime" datasets, and it's very emblematic of the data-capture process behind each reported "crime" (really, they're some of the worst datasets because of the extent of imperfect information and the incentives to capture some "crimes" rather than others). Send it through a geocoder like geocodio to resolve most of that problem. It won't be perfect.
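If a geocoder isn't an option, a cheap first pass is to normalize the location strings yourself before grouping. Here's a rough sketch of that idea in Python. It is not what the original tool did, and the abbreviation table, function names, and sample addresses are all made up for illustration; a real dataset would need a much longer table and would still miss genuine misspellings.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table (an assumption, not from any real dataset).
# Maps long or variant forms to one canonical token.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "av": "ave",
    "boulevard": "blvd",
    "road": "rd",
    "block": "blk",
    "north": "n",
    "south": "s",
    "east": "e",
    "west": "w",
}


def normalize_location(raw: str) -> str:
    """Collapse case, punctuation, spacing and common abbreviations."""
    text = raw.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # punctuation becomes whitespace
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def group_by_location(reports):
    """Group (location, crime) pairs under the normalized location key."""
    groups = defaultdict(list)
    for location, crime in reports:
        groups[normalize_location(location)].append(crime)
    return dict(groups)


# Made-up sample rows: three spellings of one block, plus one other block.
sample_reports = [
    ("100 Block of N. Main Street", "theft"),
    ("100 BLK OF NORTH MAIN ST.", "burglary"),
    ("100 blk of n main st", "theft"),
    ("200 Block of Oak Avenue", "vandalism"),
]

grouped = group_by_location(sample_reports)
for location, crimes in grouped.items():
    print(location, crimes)
```

This only handles formatting variants (case, punctuation, abbreviations). Actual typos like "Mian St" would need fuzzy matching or a geocoder on top, which is why the geocoder route tends to resolve more of the problem.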