| HN Mirror



	by grenoire 2282 days ago
	Can anybody explain why the researchers are attempting to generate the whole song as a single waveform, as opposed to feeding generated MIDI into some instruments and handling the vocals separately with a singing algorithm (perhaps a bit easier than doing the whole thing in bulk)?

3 comments

mcleaveypayne 2282 days ago

We did work last year on MIDI alone (https://openai.com/blog/musenet/), and we have some early work now on conditioning the raw audio on MIDI (early results are at the bottom of the Jukebox blog post). Agreed, though, that there should be interesting results from modeling different blends of MIDI, stem, and raw audio data. Raw audio alone gives us the most flexibility in the kinds of sounds we can create, but it's also the hardest to get good long-term structure from. Still lots more work to be done!


mycall 2279 days ago

Something like MOD/XM music comes to mind.


zeroxfe 2282 days ago

It's very hard to express all the nuances of real music and tonality in MIDI, so generating raw audio sidesteps the limitations of a MIDI intermediary, and IMO the results are absolutely phenomenal!

(BTW, there are lots of AI music generators that generate MIDI, so it's less interesting either way.)


TaylorAlexander 2282 days ago

Well, it’s not MIDI, but what you’re describing is similar to this approach:

https://magenta.tensorflow.org/ddsp
