by stuartbman 1703 days ago
One of the problems with closed models is that any model can end up keying on the 'wrong' feature. E.g. a chest X-ray reader learns that images taken with the machine in the ICU indicate sicker patients than images taken elsewhere, which isn't clinically useful. If you can't inspect the model to check for that, the vendor might claim superior performance, but then the model doesn't work as well as advertised when it's tried out. Other biases might occur as well: for instance, you can imagine a 'Greyball for healthcare' with the wrong incentives, which recommends a certain drug/therapy more often than it should.
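To make the ICU example concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the data is synthetic, the 80%/20% prevalence figures are assumptions, and `shortcut_model` is a hypothetical stand-in for a model that has learned only which scanner took the image. The point is that headline accuracy can look fine while a per-scanner breakdown shows the model has learned nothing about the patient.

```python
import random

def make_records(n, seed=0):
    """Synthetic cases: which scanner took the image, and whether the patient is sick."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        icu = rng.random() < 0.5
        # Assumed prevalence: 80% sick on the ICU scanner, 20% elsewhere.
        sick = rng.random() < (0.8 if icu else 0.2)
        records.append({"icu_scanner": icu, "sick": sick})
    return records

def shortcut_model(record):
    """Hypothetical model that ignores the image and predicts 'sick' for ICU scans."""
    return record["icu_scanner"]

def accuracy(records, predict):
    return sum(predict(r) == r["sick"] for r in records) / len(records)

def sensitivity(records, predict):
    """Fraction of truly sick patients the model flags as sick."""
    sick = [r for r in records if r["sick"]]
    return sum(predict(r) for r in sick) / len(sick)

records = make_records(10_000)

# Headline number: about 0.8, because the scanner correlates with illness.
overall = accuracy(records, shortcut_model)

# Audit: look only at scans taken outside the ICU.
ward_only = [r for r in records if not r["icu_scanner"]]
# Every sick patient scanned outside the ICU is missed.
ward_sensitivity = sensitivity(ward_only, shortcut_model)

print(f"overall accuracy: {overall:.2f}")
print(f"sensitivity outside ICU: {ward_sensitivity:.2f}")
```

An audit like this only needs the ability to run the model on labelled cases split by scanner, which is exactly the kind of check a closed model makes hard for outsiders.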

One of my more radical opinions in this area is that it should be illegal to sell a closed, proprietary ML model for use in areas of public safety, specifically in hospitals and in courts/jails. The public’s interest in transparency in such matters trumps the company’s copyrights. Trained experts get a chance to inspect every drug and every medical device that’s used; why shouldn’t they get to see how an ML model used in a hospital was trained?
Completely agree. I've not seen it tested legally, but the EU now has a 'right to explanation' where automated decisions are made about people. This would prohibit closed ML from most arenas.
But wouldn't that cause problems with disclosure of patient data?

I.e., if you want to explain ML decisions, you'd have to provide the training data, which is sensitive data.

I don't see why the training data would need to be provided. The model would, and that's derived from the training data, but the training data itself shouldn't need to be provided. It is hard to explain a model with any degree of complexity with or without training data, so it might not be easy, but that's just the nature of complex models.
Yes, this is why I'm not sure how it's been tested. In the scientific literature, in many cases the data is anonymised and made available publicly or to those interested, but you can't always anonymise the data adequately, so an audit process might be necessary.
Seems like there should be some informed consent before your identifiable medical data is used for ML training then.