by fluidcruft 1550 days ago
The problem we generally have is that plugging the vendor's [insert TensorFlow model component] into our network almost always becomes an operational no-go before purchase, for reasons including intrusiveness, privacy questions, and the vendor's ability to manipulate the process to get access to datasets. So it's actually the preprocessing step that we keep hitting as the pain point. In some cases we generate de-identified datasets for demonstration and testing, but that can be very labor intensive.

I've not encountered differential privacy in my work before now, but it could probably help with the DICOM metadata, at least for some datasets. It could still be challenging to ensure the IODs are correct (or that known quirks are preserved). Anyway, this is very interesting. I have a colleague who is working on some utilization/value research using billing records and I'll show him this.

1 comment

Thanks! Our goal is that no matter what preprocessing function they pass, they only end up accessing outputs that comply with the privacy policies. The code gets access to the real data, but the data is shielded from the vendor, who can only see protected outputs. That should address the risk of private information being exposed to them, but for sure, the more sophisticated the preprocessing code is, the more challenging this becomes. Deep learning on DICOM data is pushing the system to the edge a bit.
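
A minimal sketch of that idea, under stated assumptions: this is not the project's actual API, and `protected_mean`, `laplace_noise`, and the toy records are hypothetical names made up for illustration. The vendor's function runs on the real records, each result is clamped to a fixed range so that no single record can move the answer by more than a known amount, and only a Laplace-noised mean is released. It assumes the record count is public and ignores side channels (timing, logging, network access) that a real sandbox would also have to block.

```python
import random
from typing import Any, Callable, Optional, Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def protected_mean(
    records: Sequence[Any],
    preprocess: Callable[[Any], float],  # untrusted, vendor-supplied
    lower: float,
    upper: float,
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Run `preprocess` on the real records and release only a noisy mean.

    Hypothetical helper. Assumes the number of records is public.
    """
    if not records:
        raise ValueError("need at least one record")
    if epsilon <= 0 or upper <= lower:
        raise ValueError("require epsilon > 0 and upper > lower")
    rng = rng or random.Random()

    # Clamp every output: whatever the vendor's function returns,
    # one record contributes a value inside [lower, upper].
    clamped = [min(max(float(preprocess(r)), lower), upper) for r in records]

    # Changing one record shifts the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clamped)
    true_mean = sum(clamped) / len(clamped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Toy stand-in for real patient metadata (fabricated for the demo).
    records = [{"age": 30}, {"age": 50}, {"age": 400}]
    noisy = protected_mean(records, lambda r: r["age"], 0.0, 100.0, epsilon=1.0)
    print(f"released value: {noisy:.2f}")
```

The clamping step is what makes the guarantee independent of the preprocessing function: the noise scale depends only on the declared range and the record count, not on what the vendor's code computes. A richer output (an image tensor, say) needs a correspondingly richer bound, which is one way to see why more sophisticated preprocessing is harder to protect.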