by fluidcruft
1550 days ago
I think this is interesting, but I'm having trouble seeing how it would apply to the sorts of machine learning tasks that are drawing heavy interest in a radiology department. How does it apply to, say, development or testing of image segmentation tools? Quite often vendors want to sell us software, and we would very much like to test it at scale on our own data to see whether it's trash or not, because procurement is a beast. Does this sort of tool provide that sort of interface somehow? I can see how it works for tabular data; I'm just not sure how you can guarantee PHI is fuzzed sufficiently in images.

The system will generate a fake dataset with the exact same structure and schema as yours (the patient information is realistic, and the images look reasonable and, importantly, have the right encoding, size, etc.). The purpose of this fake data is to let the vendor adjust their algorithm so it can consume your data as it is. The vendor builds up the preprocessing on the fake data and then submits their job to the API (say, a preprocessing function to be applied to each record and a TensorFlow model to be fitted on the data, or just evaluated to measure its performance). The preprocessing code runs on the original records, and the model is trained or validated against the real data. In the end they can prove the value of their model without ever getting their hands on the real data.
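
To make that workflow concrete, here is a rough sketch in plain Python. It is not the tool's actual API: the record schema, `make_fake_record`, `vendor_preprocess`, `vendor_model` and `run_job` are all hypothetical names made up for illustration, and the "model" is just a fixed threshold standing in for a real segmentation network. The point is only the shape of the exchange: the vendor writes code against fake records, and the data owner runs that code on real records and hands back an aggregate score.

```python
import random
from statistics import mean


def make_fake_record(rng, size=4):
    """Synthetic record with the same (hypothetical) schema as a real one."""
    return {
        "patient_id": f"FAKE-{rng.randrange(10**6):06d}",
        "image": [[rng.randrange(256) for _ in range(size)] for _ in range(size)],
        "mask": [[rng.randrange(2) for _ in range(size)] for _ in range(size)],
    }


def vendor_preprocess(record):
    """Vendor code, written against fake data only: scale pixels to [0, 1]."""
    return [[px / 255 for px in row] for row in record["image"]]


def vendor_model(image):
    """Stand-in for the vendor's segmentation model: a fixed threshold."""
    return [[1 if px > 0.5 else 0 for px in row] for row in image]


def dice(pred, truth):
    """Dice overlap between two binary masks given as lists of rows."""
    inter = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if total == 0 else 2 * inter / total


def run_job(records, preprocess, model):
    """Runs on the data owner's side; only an aggregate is returned."""
    scores = [dice(model(preprocess(r)), r["mask"]) for r in records]
    return {"n_records": len(scores), "mean_dice": mean(scores)}


# Vendor side: check that the pipeline consumes the fake schema.
rng = random.Random(0)
fake_records = [make_fake_record(rng) for _ in range(5)]
fake_result = run_job(fake_records, vendor_preprocess, vendor_model)

# Data owner side: the same functions are applied to real records.
# (A hand-made record stands in for real data here.)
real_records = [{"patient_id": "REAL-1",
                 "image": [[255, 0], [0, 255]],
                 "mask": [[1, 0], [0, 1]]}]
real_result = run_job(real_records, vendor_preprocess, vendor_model)
```

Note that this sketch says nothing about the question of whether the fake images themselves are sufficiently free of PHI; it only illustrates that the vendor's code never needs the real records returned to it, just the summary from `run_job`.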