
Ah thank you, that's very interesting.

So the markup isn't organized along "emotional" lines, but rather around "technical" attributes such as speed, pitch, volume, pause between words, and so on.

Obviously, hand-coding those things in XML would be a nightmare. Now I find myself wondering 1) whether these technical parameters are enough to synthesize speech that sounds like a reasonable approximation of emotion (or whether they're insufficient because changes in resonance and timbre are crucial too), and 2) whether there are tools that can translate, say, 100 different basic emotional descriptions ("excitedly curious", "depressed but making effort to show interest", etc.) into the appropriate technical parameters, so the markup would actually be usable.
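
To make question 2 a bit more concrete, here's a rough sketch of what such a translation layer might look like. I'm assuming the markup is SSML-style, where `<prosody>` takes `rate`, `pitch`, and `volume`, and `<break>` takes a `time`. The preset table, the numbers in it, and the `to_ssml` helper are all made up by me for illustration; they aren't tuned values or any real tool's API, and the `<speak>` root attributes are left out for brevity.

```python
from xml.sax.saxutils import escape

# Hypothetical presets: labels and values are illustrative guesses, not tuned.
EMOTION_PRESETS = {
    "excitedly curious": {
        "rate": "fast", "pitch": "+15%", "volume": "loud", "pause_ms": 150,
    },
    "depressed but making effort to show interest": {
        "rate": "slow", "pitch": "-10%", "volume": "soft", "pause_ms": 400,
    },
}


def to_ssml(sentences, emotion):
    """Wrap sentences in one <prosody> element, with a <break> between them."""
    p = EMOTION_PRESETS[emotion]  # raises KeyError for an unknown label
    pause = f'<break time="{p["pause_ms"]}ms"/>'
    # Escape &, <, > so the sentence text can't break the XML.
    body = pause.join(escape(s) for s in sentences)
    return (
        f'<speak><prosody rate="{p["rate"]}" pitch="{p["pitch"]}" '
        f'volume="{p["volume"]}">{body}</prosody></speak>'
    )


print(to_ssml(["Oh?", "Tell me more."], "excitedly curious"))
```

A lookup table like this only covers the knobs the markup exposes, so it says nothing about question 1: if resonance and timbre matter, no choice of rate, pitch, and volume will get there.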

Anyways, just a fascinating area of study.