by opsiprogram
2466 days ago
How would you combine HIPAA de-identified data with another data source to identify an individual? I'm not suggesting it can't be done, just wondering how one might do it. Linking data that identifies a person to de-identified data would only be possible if the original data was not properly de-identified, right?
|
Consider the following de-identified data sets:
- [date, time, clinic, procedure or test being done, insurer] - collected by the clinic chain so that it can bill insurers
- [month, clinic, test name, test result] - for all tests performed in the last year, collected for statistical purposes
- [date, time, latitude, longitude, phone number] - because AFAIR telcos sell this data
- [name, surname, phone number, ...] - some insurance company's list of customers
If you can get your hands on these datasets, you can trivially re-identify patients and even assign test results to them with high probability. How high depends on how many tests of a given type are performed in a given clinic per unit of time used to group the second data set (a month, in this example): the fewer tests per group, the more certain the match.
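To make the chain of joins concrete, here is a rough Python sketch of how the four lists could be linked. Everything in it is an assumption made up for illustration: the records are invented toy data, `clinic_coords` assumes clinic locations are known (e.g. from public addresses), and the 15-minute window and the crude distance threshold in degrees are arbitrary choices, not anything a real dataset prescribes.

```python
from datetime import datetime, timedelta
from math import hypot

# Invented toy records; layouts follow the four lists above.
billing = [  # (date-time, clinic, procedure, insurer)
    (datetime(2017, 3, 14, 10, 30), "clinic_a", "HbA1c test", "AcmeHealth"),
    (datetime(2017, 3, 20, 15, 0), "clinic_b", "lipid panel", "AcmeHealth"),
]
clinic_coords = {  # assumed to be known, e.g. from public addresses
    "clinic_a": (52.2300, 21.0100),
    "clinic_b": (52.4000, 16.9200),
}
telco = [  # (date-time, latitude, longitude, phone number)
    (datetime(2017, 3, 14, 10, 25), 52.2301, 21.0102, "555-0101"),
    (datetime(2017, 3, 14, 10, 28), 52.1000, 21.2000, "555-0102"),
    (datetime(2017, 3, 20, 15, 5), 52.4001, 16.9199, "555-0102"),
]
customers = {"555-0101": "Alice Example", "555-0102": "Bob Example"}
results = [  # (month, clinic, test name, test result)
    ("2017-03", "clinic_a", "HbA1c test", "7.9%"),
    ("2017-03", "clinic_b", "lipid panel", "normal"),
]


def phones_near(when, clinic, window=timedelta(minutes=15), radius_deg=0.001):
    """Phones seen close to the clinic around the time of the visit.

    Distance is compared in raw degrees, which is only a crude approximation.
    """
    lat0, lon0 = clinic_coords[clinic]
    return {
        phone
        for t, lat, lon, phone in telco
        if abs(t - when) <= window and hypot(lat - lat0, lon - lon0) <= radius_deg
    }


def link():
    """Join billing -> telco -> customer list -> monthly test results."""
    linked = []
    for when, clinic, procedure, _insurer in billing:
        month = when.strftime("%Y-%m")
        # Possible results for this test at this clinic in this month.
        possible = [
            r for m, c, t, r in results
            if m == month and c == clinic and t == procedure
        ]
        for phone in phones_near(when, clinic):
            name = customers.get(phone)
            if name is not None:
                linked.append((name, procedure, possible))
    return linked


for row in link():
    print(row)
```

In this toy data each visit has exactly one nearby phone and one matching result, so the match is unambiguous. With real data you would get several candidates per visit and several possible results per month, which is where the probabilities come in.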
Real-world data may be less clear-cut than this, but there is more of it, and you can apply statistical methods to find correlations. You don't need to be 100% sure customer X has diabetes for the information to be useful to you; 70% or 60% is useful too.
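As a back-of-the-envelope illustration of where a number like 70% could come from: if you know someone took a given test at a given clinic in a given month, but you only have that month's pooled results, your confidence is simply the share of results in that pool with the outcome you care about. The function name and the numbers below are made up, and the sketch assumes every result in the pool is equally likely to be theirs.

```python
from fractions import Fraction


def share_with_result(results_in_group, result_of_interest):
    """Chance that a person known to be in this group got the given result,
    assuming each result in the group is equally likely to be theirs."""
    hits = results_in_group.count(result_of_interest)
    return Fraction(hits, len(results_in_group))


# Hypothetical month at one clinic: 10 tests of this type, 7 came back "high".
group = ["high"] * 7 + ["normal"] * 3
print(float(share_with_result(group, "high")))  # 0.7
```

The smaller the group, or the more lopsided its results, the closer this gets to certainty.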