Diffsound: Discrete Diffusion Model for Text-to-Sound Generation

Diffsound: Discrete Diffusion Model for Text-to-Sound Generation (dongchaoyang.top)
51 points by selimonder 1410 days ago

3 comments

Twiddling around trying to get this to work. Looks exciting :)

Very cool. Need to dig around and figure out what the training dataset is. Could be a great way to get some sample fodder.

There seems to be a missing Python module called "image_synthesis". Does anyone know more about this?