Hacker News
by nl 1207 days ago
Without a name or other PII this seems like a misplaced concern.
3 comments

It's trivial to re-identify someone from shockingly limited data. Just from the submission to an outside service, that service knows a date, a location, and whatever information was submitted. That's plenty, especially assuming the prompt referenced a procedure or comorbidities.
A date of what? The request to ChatGPT to generate an imaginary appeal?

So you know the individual was rejected prior to that. Probably. Maybe.

And maybe roughly where the individual is located, because ChatGPT sees an IP off of… something.

So somewhere, someone asked ChatGPT about a rejected treatment or procedure appeal.

Now if the doctor provides a poorly written appeal and asks for it to be fixed, that is another case entirely, especially if they left in patient info. But there is a very large gap between these two situations, and the first one isn’t nearly as much of a risk as you suggest.

That is an obviously incorrect assumption. It is possible to de-anonymize most data sets, and there is reason to believe this one is no different. Health data by its nature is very personal and specific.
Doctors publish case studies all the time which contain anonymized data. Presumably those go through reviews to make sure that nothing is being leaked, but health data by its nature is specific but not very personal (at least not identifiable).

Also, depending on what you're using ChatGPT for, this is no worse than Googling something, which doctors do a lot as well.

>consent
Informed consent isn't a legal requirement. It's down to ethics and occasionally the journal's publishing requirements. So if we bring the analogy back to ChatGPT, using it for queries isn't breaking HIPAA or any laws.
This is the prompt that was listed.

“Write an appeal letter to a medical insurance company for a patient who needs a biopsy for a bone lesion given prior unclear diagnosis.”

Add an arbitrary IP address and timestamp and you are still very far away from anything personally identifying. (Where does your computer suggest you are right now?)

De-anonymizing is usually done by combining datasets. That seems unlikely here.
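To make "combining datasets" concrete, here is a minimal sketch of a linkage join in Python. Every record below is invented for illustration, and the field names, the `link` helper, and above all the second, name-bearing dataset are assumptions; that second dataset is exactly what an attacker would need and probably doesn't have here.

```python
# Toy, invented records: nothing here is real data.

# What an outside service might log for a prompt (no names).
queries = [
    {"date": "2023-03-01", "city": "Springfield", "procedure": "bone biopsy"},
]

# A hypothetical second dataset that does carry names.
named_records = [
    {"name": "Patient A", "date": "2023-03-01", "city": "Springfield", "procedure": "bone biopsy"},
    {"name": "Patient B", "date": "2023-03-01", "city": "Springfield", "procedure": "knee MRI"},
    {"name": "Patient C", "date": "2023-03-02", "city": "Shelbyville", "procedure": "bone biopsy"},
]

# Quasi-identifiers shared by both datasets (an assumption of this sketch).
KEYS = ("date", "city", "procedure")


def link(anonymous, named, keys=KEYS):
    """For each anonymous record, list the names whose quasi-identifiers all match."""
    results = []
    for record in anonymous:
        matches = [
            person["name"]
            for person in named
            if all(person[k] == record[k] for k in keys)
        ]
        results.append(matches)
    return results


if __name__ == "__main__":
    # A single match would mean re-identification; several matches or none would not.
    print(link(queries, named_records))
```

The fewer fields the two datasets share, the more candidates each query matches, which is why a bare prompt plus an IP address and timestamp is a weak join key.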
Why would a letter to an insurance company on behalf of a patient not include PII? The letter itself is surely mostly PII. And is almost certain to contain privileged information.
Probably not. You fill in the PII afterwards and it’s essentially just metadata. “Patient has this condition, needs this test, has had this and that happen”: those things aren’t PII. Leave out names, places, ID numbers, etc. and you have appropriately deidentified a document.
find replace name with ABCEDFG HIJKLMNOP

feed to ChatGPT

find replace ABCEDFG HIJKLMNOP with name?
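Those three steps can be sketched in Python roughly as follows. This is a minimal sketch: `pseudonymize`, `restore`, and `round_trip` are illustrative names, and `ask_model` is a hypothetical stand-in for the external service, not a real API. It only masks the exact strings you give it; the rest of the medical narrative passes through untouched.

```python
PLACEHOLDER = "ABCEDFG HIJKLMNOP"  # placeholder string from the comment above


def pseudonymize(text: str, name: str, placeholder: str = PLACEHOLDER) -> str:
    """Step 1: replace every exact occurrence of the name with the placeholder."""
    return text.replace(name, placeholder)


def restore(text: str, name: str, placeholder: str = PLACEHOLDER) -> str:
    """Step 3: put the real name back wherever the placeholder survived."""
    return text.replace(placeholder, name)


def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for the external service; it just wraps the prompt."""
    return "Dear Insurer,\n" + prompt


def round_trip(draft: str, name: str) -> str:
    """Mask the name, send only the masked text out, then restore the name locally."""
    masked = pseudonymize(draft, name)
    reply = ask_model(masked)  # step 2: only masked text leaves the machine
    return restore(reply, name)


if __name__ == "__main__":
    print(round_trip("Jane Doe needs a biopsy for a bone lesion.", "Jane Doe"))
```

Exact-match replacement misses nicknames, initials, misspellings, and any identifying detail that isn't a listed string.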

Well obviously, but you don't need someone's name or SSN to identify them. A sufficiently detailed medical story can be much more than sufficient.
How do you envision that happening? There’s no big database of people’s medical conditions out there that you can use to look up their name and address, so it’s kind of an impossible reverse lookup that would need to be done, right? Unless you’re talking about a state-level actor or something that is tailing someone’s movements around like Jason Bourne and cross-referencing it with medical GPT queries, but in that case you’re gonna be compromised anyway, probably in far easier ways.
Maybe, but not as far as regulations go.