| HN Mirror

	by jmvalin 842 days ago
	Actually, what we're doing with DRED isn't that far from what you're suggesting. The difference is that we keep more information about the voice and intonation, and we avoid the latency that an ASR stage would otherwise add. In the end, the output is still synthesized from higher-level, efficiently compressed information.