by not2b 934 days ago
You seem to be assuming, without any evidence at all, that LLMs giving medical advice are likely to be roughly equivalent in accuracy to doctors who are actually examining the patient and not just processing language, just because you are aware that medical mistakes are common.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425828/ Use of GPT-4 to Analyze Medical Records of Patients With Extensive Investigations and Delayed Diagnosis

"Six patients 65 years or older (2 women and 4 men) were included in the analysis. The accuracy of the primary diagnoses made by GPT-4, clinicians, and Isabel DDx Companion was 4 of 6 patients (66.7%), 2 of 6 patients (33.3%), and 0 patients, respectively. If including differential diagnoses, the accuracy was 5 of 6 (83.3%) for GPT-4, 3 of 6 (50.0%) for clinicians, and 2 of 6 (33.3%) for Isabel DDx Companion"

Six patients is a long way from persuasive evidence, because with so few patients randomness is going to be a large factor. And it appears that the six were chosen from the set of patients that doctors were having trouble diagnosing, which may put a thumb on the scale against doctors. But yes, it certainly suggests that a larger study might be worth doing (also including patients diagnosed correctly by doctors, to catch cases where GPT-4 doesn't do as well).
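To put a rough number on how much randomness matters at n = 6, here is a minimal sketch that computes an approximate 95% Wilson score interval for each reported accuracy. The function name `wilson_interval` and the choice of the Wilson method are my own assumptions for illustration, not anything taken from the paper. It also treats each accuracy as an independent binomial proportion, which ignores the fact that all three methods were evaluated on the same six patients.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion.

    Illustrative helper (not from the paper); z=1.96 corresponds to ~95%.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to avoid tiny floating-point overshoot at the edges.
    return max(0.0, center - half), min(1.0, center + half)

# Primary-diagnosis counts quoted above: correct diagnoses out of 6 patients.
results = [("GPT-4", 4), ("clinicians", 2), ("Isabel DDx Companion", 0)]

for name, correct in results:
    lo, hi = wilson_interval(correct, 6)
    print(f"{name}: {correct}/6, 95% interval roughly {lo:.2f} to {hi:.2f}")
```

Under those assumptions the intervals for GPT-4 and the clinicians come out at roughly 0.30 to 0.90 and 0.10 to 0.70, so they overlap heavily. That is consistent with the point that six patients cannot separate the two.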