by jeffharris 462 days ago
It's a slightly better model for TTS, with extra training focused on reading the script exactly as written.

E.g. the audio-preview model, when instructed to speak "What is the capital of Italy", would often answer with "Rome" instead of reading the question aloud. This model should be much better in that regard.

No plans to have localized voice models, but we do want to expand the menu of voices with voices that are best at different accents.

1 comments

Great to hear, thanks. My favorite was "I would like you to repeat the following in an Australian accent: Hi there, welcome to Sydney.", which more often than not swapped "Hi there" for "G'day"!