
It's worth pointing out that anything with patient IDs (even with all personal info removed) is still not really anonymous, "safe" data, and should be treated carefully.

I'm not sure what measures MIMIC takes (beyond just stripping demographic info), but a given patient's pattern of healthcare interactions makes quite a unique fingerprint -- and some parts of that fingerprint are likely public information for some patients.

E.g., imagine a celebrity who (as you can find from the tabloids) was treated at Hospital X for a sprained ankle on 2012-07-14, and gave birth to a daughter at Hospital Y on 2014-04-01. If her record -- completely "anonymized" -- is in a data set that lets you search for patients matching these two events... it seems fairly likely you'd be able to narrow it down to only a few candidates, or quite likely an exact match. And then once you have her pseudonym/ID, does the rest of the record reveal anything interesting? An abortion no one knew about (possibly not even her partner)? A venereal disease treatment?
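
To make the matching step concrete, here is a rough sketch of that kind of lookup. Everything in it is made up for illustration: the records, the pseudonyms, the field layout, and the `matching_pseudonyms` helper are all hypothetical and have nothing to do with how any real data set is structured.

```python
from datetime import date

# Entirely synthetic, made-up records: (pseudonym, hospital, event, date).
RECORDS = [
    ("p001", "X", "sprained ankle", date(2012, 7, 14)),
    ("p001", "Y", "delivery", date(2014, 4, 1)),
    ("p001", "Y", "oncology consult", date(2015, 2, 3)),
    ("p002", "X", "sprained ankle", date(2012, 7, 14)),
    ("p002", "X", "flu", date(2013, 1, 9)),
    ("p003", "Y", "delivery", date(2014, 4, 1)),
]


def matching_pseudonyms(records, known_events):
    """Return the pseudonyms whose records contain every known event."""
    seen = {}
    for pid, hospital, event, day in records:
        seen.setdefault(pid, set()).add((hospital, event, day))
    # A pseudonym is a candidate only if all known events appear in its record.
    return sorted(pid for pid, events in seen.items() if known_events <= events)


# The two events assumed to be public knowledge (e.g., from the tabloids).
KNOWN = {
    ("X", "sprained ankle", date(2012, 7, 14)),
    ("Y", "delivery", date(2014, 4, 1)),
}

candidates = matching_pseudonyms(RECORDS, KNOWN)
print(candidates)
```

In this toy data each event on its own matches two pseudonyms, but the pair together matches only one, and that pseudonym's other entries are then readable too.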

Even the fact that a patient had an appointment at a given clinic is sensitive data -- e.g., seeing an IVF specialist, an oncologist, etc.

It's a tricky field to navigate.