| HN Mirror


	by slow_numbnut 949 days ago
	Instead of sacrificing flexibility by building one monolithic model that does audio-to-audio in one go, wouldn't it be better to train a model that only handles conversing with the user (knowing when the user is done talking, when it's hearing itself, etc.) and leave the thinking to other, more generic models?

1 comment

You don't lose flexibility with an end-to-end model. You lose controllability. But there are ways to mitigate that.