by 0xb100db1ade 2454 days ago
I see where you're coming from, but would the subjects be comfortable with all their data becoming public?
2 comments

I think you could, and should anyway, make the data anonymous. Give every participant a GUID as a participant ID and add a step to purge personally identifiable information. Then you can share records without identities attached.
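A rough sketch of what that step might look like, assuming records arrive as plain dicts. The field names in `PII_FIELDS` and the `pseudonymize` helper are made up for illustration, not taken from any real study's schema:

```python
import uuid

# Hypothetical list of direct identifiers; a real dataset would need its own.
PII_FIELDS = {"name", "email", "phone", "address"}


def pseudonymize(records, id_field="name"):
    """Assign each participant a random GUID and drop the PII fields."""
    ids = {}  # original identity -> GUID; kept in memory only, never shared
    cleaned = []
    for rec in records:
        key = rec[id_field]
        if key not in ids:
            ids[key] = str(uuid.uuid4())  # random, not derived from the identity
        row = {k: v for k, v in rec.items() if k not in PII_FIELDS}
        row["participant_id"] = ids[key]
        cleaned.append(row)
    return cleaned


records = [
    {"name": "Ann", "email": "ann@example.com", "score": 3},
    {"name": "Bob", "email": "bob@example.com", "score": 5},
    {"name": "Ann", "email": "ann@example.com", "score": 4},
]
shared = pseudonymize(records)
```

Note this only strips direct identifiers. Whatever fields remain in `shared` can still be identifying in combination.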
That didn’t work for the AOL research several years ago. https://arstechnica.com/tech-policy/2009/09/your-secrets-liv...
Making things like medical records actually anonymous, especially in the face of bad actors, is an unsolved problem.
Anonymizing data is indeed a difficult problem, but aggregated data in particular can be, and has been, reliably anonymized. For example, the problem with this dataset would have been visible in aggregated data (e.g. aggregated by nationality).
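As a minimal sketch of what "aggregated by nationality" could mean in practice, assuming each record is a dict with a `nationality` key. The `aggregate_counts` helper and the `min_cell` threshold are my own assumptions (suppressing very small groups is a common precaution, not something specified here):

```python
from collections import Counter


def aggregate_counts(records, field="nationality", min_cell=5):
    """Count records per group; hide groups smaller than min_cell."""
    counts = Counter(rec[field] for rec in records)
    # Small groups are replaced with None so that a handful of
    # individuals cannot be singled out from the published totals.
    return {group: (n if n >= min_cell else None) for group, n in counts.items()}


records = [{"nationality": "US"}] * 6 + [{"nationality": "FR"}] * 2
summary = aggregate_counts(records)
```

Only `summary` would be published, never the per-participant rows.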