| HN Mirror



	by lefthansolo 5010 days ago
	This is a nice one, but I'm still confounded by the lack of progress since Bell Labs made an online text-to-speech converter many years ago. In particular, the notion that the interpretation of each sentence is idempotent (that a sentence should sound the same no matter how many times it is repeated) is just wrong. Want to see what I mean? A human would not speak like the following; there would be differences in intonation, "emotion" (sounding bored, angry, excited, etc., varying with the number of times "dogs" has been said), speed, and delay. In addition, you have to breathe at some point, and even the best audiobooks have some level of breath noise. http://tts-api.com/tts.mp3?q=dogs.%20dogs.%20dogs.%20dogs.%2....
