Hacker News
by 7373737373 1054 days ago
A high quality speech-to-text and/or text-to-speech program that just works out of the box

Challenge: make it require at most (1) opening a website, (2) a single package manager installation, or (3) a single app store installation

a good speech-to-text tool that integrates with the programs we already use would help my hearing-impaired friend understand the world better

and a nice-sounding text-to-speech tool would help me listen to Wikipedia articles

I believe there are already good AI models out there that can serve as the core, but they aren't fully developed: they are missing features such as "really real time"/instant transcription (https://github.com/openai/whisper/discussions/608) and speaker separation (https://github.com/openai/whisper/discussions/264). I also haven't encountered a single good interface for them yet.