HN Mirror



	by nopelynopington 263 days ago
	If this lives up to the demo, it's a huge development for anyone looking to do realistic TTS without paying to use an API.


kristopolous 259 days ago

There are quite a few fairly low-overhead models around that do that in real time these days.


MarsIronPI 259 days ago

But how many of them support voice cloning?

(Genuine question; I haven't seen any other than this one.)


nickthegreek 258 days ago

Microsoft's VibeVoice.


MarsIronPI 258 days ago

VibeVoice (according to the repo description) is currently unavailable due to "misuse". But my impression was that it required a significant amount of VRAM (>8GB), or that it wasn't suitable for running on-device on low-spec hardware.


nickthegreek 257 days ago

It's unavailable from their repo, but it was released under an open license and mirrors exist. I'm not sure what the VRAM requirements are.


MarsIronPI 257 days ago

According to this issue[0], the 1.5B model needs 6GB of VRAM. Meanwhile, it looks like NeuTTS is designed to run on CPU, which is nice for older or lower-spec hardware.

0: https://github.com/microsoft/VibeVoice/issues/26#issuecommen...


foofoo12 259 days ago
