| I don't understand what you're getting at with this list. 1. Yes, datasets need to be cleaned. But you need to have the dataset before you can clean it, and different people will want to clean it in different ways. Get it up there first, and keep the political debates confined to the gathering methods. Griping about raw datasets only gives them an excuse to keep delaying putting anything out (in other words, this critique is actively harming the movement, so please stop making it). 2. I don't understand what you mean by this. If a link points to a high-quality dataset that's otherwise hard to find, then it's very valuable. 3. Not all data is expressible in tab-delimited ASCII tables. I'd like my SEC filings in well-structured XML, for instance. 4. This is a strawman. Nobody serious has ever said a good dataset is easy to use and understand. 5. Ironically, this is the one point you make that I agree with, and then you claim it doesn't apply to data.gov. I think this is actually the worst thing about data.gov right now: they think they're giving us something when they post their little summaries. Give us the raw data, please. 6. Isn't this just restating a combination of #1 and #3? Yes, big, clean, monolithic datasets are nice, but the priority is getting access to the data in the first place. 7. You're restating #4, which was a strawman. |