| HN Mirror



	by hereonout2 481 days ago
	This is an interesting take, and I'd guess that the training data for this probably did use podcasts as a source. Getting very realistic, real-world conversational training data for an AI would be hard. Only a subset of us appear on podcasts, radio, or TV, and we probably all speak in a slightly artificial manner when we do.

2 comments

scoot 481 days ago

When I commented on the unnatural cadence, it told me that it had been trained on podcasts, which does help explain the issue: some people tend to “live-edit” themselves when a conversation is being recorded, which leads to this staccato. It seems they need to find a better source of training data for more natural conversational speech.

jofzar 481 days ago

I agree. I think it's probably very easy to find billions of hours of conversation on YouTube, but none of it is set up as training data with a good transcript.

hereonout2 481 days ago

Yep! It's public dialogue, intended for an audience, with a prepared topic, etc. Or it's actors imitating private dialogue, but again shaping it towards an audience.

AI agents like this are trying to recreate personal intimacy I guess, which does feel like it might be different somehow.
