Hacker News
by cinntaile 1197 days ago
> We then transcribe the consultation using a fine-tuned version of Whisper. We've trained Whisper with tens of thousands of hours of medical consultation and medical terms recordings, and we have now reached an error rate which is 3× lower than Google's Speech-To-Text.

I think the general idea is interesting, but this is the wrong benchmark to convince people imo. What is the actual error rate of Google's Speech-to-Text on your data? How confident are you that the data is properly labeled? "3× lower" isn't necessarily impressive without the absolute numbers behind it. Then a follow-up question: what I as a patient would want to know is something else. How does it compare against the medical secretaries who usually transcribe documents like these?
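
To make the relative-vs-absolute point concrete, here is a rough sketch. I'm assuming the "error rate" in question is word error rate (WER), which the post doesn't actually say. The function names and every number below are made up for illustration, not taken from the product or from Google.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenization and exact string matching;
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Classic Levenshtein dynamic programming, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def improvement_factor(baseline_errors: int, candidate_errors: int) -> float:
    """How many times fewer errors the candidate makes than the baseline."""
    return baseline_errors / candidate_errors


# Hypothetical errors per 1000 words: (baseline, candidate).
scenarios = {"hard audio": (300, 100), "easy audio": (30, 10)}
for name, (baseline, candidate) in scenarios.items():
    factor = improvement_factor(baseline, candidate)
    print(f"{name}: {factor:.0f}x fewer errors, "
          f"but still {candidate} wrong words per 1000")

# One wrong word out of six, and it happens to be the dose.
print(word_error_rate("patient takes 50 mg metoprolol daily",
                      "patient takes 15 mg metoprolol daily"))
```

Both made-up scenarios come out as "3× lower", yet one leaves ten times as many wrong words in the transcript as the other. The last line also shows why a single aggregate rate says little about which words are wrong.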