| I can hear some distortion in the piano notes, which may be an audio compression artefact or the output of the resynthesis process. If you train NNs at the phrase level and overfit, you get something that is more or less the same as cross-fading at random between short sections. Piano music is very idiomatic, so you'll capture some typical piano gestures that way. But I'd be surprised if the music stays listenable for long. Classical music has big structures, and there's a difference between recognising letters (notes), recognising phrases (short sentences), recognising paragraphs (phrase structures), and parsing an entire piece (a novel or short story with characters and multiple plot lines). Corpus methods don't work very well for non-trivial music, because there's surprisingly little consistency at the more complex levels. NN synthesis could be interesting, though. If you trained an NN on *sounds* at various pitches and velocity levels, you might be able to squeeze a large and complex collection of samples into a compressed data set. Even if the output isn't very realistic, you'd still get something unusual and interesting. |