by josh-pdap
1364 days ago
1. I replied to the parent comment here; our answer to the scale problem is to recognize that people doing web scraping are as decentralized as the police. Our goal is to empower people who have questions about the police to answer them.
2. You can run them locally. We're not running the scrapers anywhere, or storing extractions anywhere.
3. This is a big, big question. Right now, the answer is dependent on the use case. Rather than trying to make the world's biggest database, we're going to respond to community needs and build this kind of thing as it comes up.
4. https://measuresforjustice.org/ is doing something like this! We're interested in creating incentives for police departments to make their data more accessible and transparent.
|
What I would expect to see is something like:
1. Here's what data we want from each police department each day. Here's what value you should use to indicate that data is not available.
2. Here's a list of police departments. Write a scraper. If it passes tests showing that it generates valid data, and passes code review, we will run the scraper on a daily basis, storing the output in this database.
3. Here's how you can query our database.
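
To make points 1 and 2 concrete, here is a rough sketch of the kind of validity test I have in mind. Everything in it is made up for illustration: the field names, the `NOT_AVAILABLE` sentinel, and the `validate_record` helper are assumptions, not anything the project has published.

```python
from datetime import date

# Hypothetical sentinel a scraper emits when a department does not publish a value.
NOT_AVAILABLE = "N/A"

# Hypothetical daily schema: who reported, for which day, and a couple of counts.
IDENTITY_FIELDS = ("department_id", "report_date")
COUNT_FIELDS = ("arrests", "use_of_force_incidents")


def validate_record(record):
    """Return a list of problems with one scraped record (empty means valid)."""
    errors = []

    # Identity fields must always be present as non-empty strings.
    for field in IDENTITY_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"{field}: must be a non-empty string")

    # The report date must parse as an ISO calendar date.
    report_date = record.get("report_date")
    if isinstance(report_date, str) and report_date:
        try:
            date.fromisoformat(report_date)
        except ValueError:
            errors.append("report_date: must be an ISO date (YYYY-MM-DD)")

    # Count fields must be present, and hold either the sentinel
    # or a non-negative integer (bool is rejected explicitly).
    for field in COUNT_FIELDS:
        if field not in record:
            errors.append(f"{field}: missing; use {NOT_AVAILABLE!r} if unavailable")
            continue
        value = record[field]
        if value == NOT_AVAILABLE:
            continue
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            errors.append(f"{field}: must be a non-negative integer or {NOT_AVAILABLE!r}")

    return errors
```

A scraper would only be accepted for the daily run if every record it emits comes back with an empty error list. The point of the explicit sentinel is that "the department did not publish this" stays distinguishable from "the scraper forgot this field".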