by TheOtherHobbes 3574 days ago

I can hear some distortion in the piano notes, which may be an audio compression artefact or the output of the resynthesis process.

If you train NNs at the phrase level and overfit, then you get something that is indeed more or less the same as cross-fading at random between short sections.
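To make the comparison concrete, here is a minimal sketch of what "cross-fading at random between short sections" could look like on raw samples. The function name, the linear fade, and all parameters are assumptions made for illustration; nothing here comes from the model under discussion.

```python
import random

def crossfade_random_sections(source, section_len, fade_len, n_sections, seed=0):
    """Stitch randomly chosen sections of `source` together with linear cross-fades.

    Hypothetical helper: `source` is a list of float samples, lengths are in samples.
    """
    if not 0 < fade_len < section_len <= len(source):
        raise ValueError("need 0 < fade_len < section_len <= len(source)")
    rng = random.Random(seed)
    out = []
    for _ in range(n_sections):
        # Pick a random section of the source material.
        start = rng.randrange(len(source) - section_len + 1)
        section = source[start:start + section_len]
        if not out:
            out.extend(section)
            continue
        # Blend the tail of what we have with the head of the new section.
        for i in range(fade_len):
            w = (i + 1) / (fade_len + 1)  # weight of the incoming section
            idx = len(out) - fade_len + i
            out[idx] = (1 - w) * out[idx] + w * section[i]
        # Append the rest of the new section unchanged.
        out.extend(section[fade_len:])
    return out
```

Every local stretch of the output is a faithful copy of the source, which is why it can sound idiomatic from moment to moment while having no larger plan.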

Piano music is very idiomatic, so you'll capture some typical piano gestures that way.

But I'd be surprised if the music stays listenable for long. Classical music has big structures, and there's a difference between recognising letters (notes), recognising phrases (short sentences), recognising paragraphs (phrase structures), and parsing an entire piece (a novel or short story with characters and multiple plot lines).

Corpus methods don't work very well for non-trivial music, because there's surprisingly little consistency at the more complex levels.

NN synthesis could be an interesting thing though. If you trained an NN on sounds at various pitches and velocity levels, you might be able to squeeze a large and complex collection of samples into a compressed data set.

Even if the output isn't very realistic, you'd still get something unusual and interesting.

Scaevolus 3574 days ago

The samples are uncompressed WAV files, so everything you hear is a direct result of the synthesis process. Some of the distortion is a result of the 16kHz sample rate; it's not 44.1kHz CD quality.
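As a rough illustration of what the lower sample rate costs: a sampled signal can only represent frequencies up to half the sample rate (the Nyquist limit), so 16kHz audio tops out at 8kHz versus 22.05kHz for CD. The sketch below counts how many harmonics of a note fit under that limit. The function names are made up for this example, and real piano partials are slightly inharmonic, so treat the counts as approximate.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def harmonics_below_nyquist(fundamental_hz, sample_rate_hz):
    """Number of ideal integer harmonics (including the fundamental) at or below Nyquist."""
    return int(nyquist_hz(sample_rate_hz) // fundamental_hz)

if __name__ == "__main__":
    a4 = 440.0  # concert A
    for rate in (16_000, 44_100):
        print(rate, nyquist_hz(rate), harmonics_below_nyquist(a4, rate))
```

Everything above the limit is simply absent, which tends to make the result sound duller rather than gritty.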

IanCal 3573 days ago

It's quantized to just 256 values though, which could be causing some of the distortion.
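To get a feel for how much noise 256 levels adds, here is a small sketch. I'm assuming mu-law companding with mu = 255, a common way to squeeze audio into 8 bits; the thread doesn't say which scheme is actually used, and the function names are just for illustration.

```python
import math

MU = 255  # assumption: 8-bit mu-law, giving 256 codes (0..255)

def mulaw_encode(x, mu=MU):
    """Map a sample in [-1, 1] to an integer code in 0..mu."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int(round((y + 1) / 2 * mu))

def mulaw_decode(code, mu=MU):
    """Map an integer code in 0..mu back to a sample in [-1, 1]."""
    y = 2 * code / mu - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def snr_db(signal, reconstructed):
    """Signal-to-noise ratio in dB between a signal and its reconstruction."""
    noise = sum((s - r) ** 2 for s, r in zip(signal, reconstructed))
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / noise)

if __name__ == "__main__":
    # 0.1 s of a 440 Hz sine at 16 kHz, half of full scale.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16_000) for n in range(1600)]
    decoded = [mulaw_decode(mulaw_encode(s)) for s in tone]
    print(round(snr_db(tone, decoded), 1), "dB")
```

Companding spends more of the 256 codes on quiet samples, so the error grows with amplitude instead of being a fixed hiss, but it is still far coarser than 16-bit audio.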
