| Hi savanaly, When we talk about "what makes Bach sound like Bach," the technical concept we have in mind is the recent work in computer vision on style transfer (for example, https://arxiv.org/abs/1508.06576). We are excited to work on adapting these models to the musical domain! As for note prediction, you can see our results in our paper: https://arxiv.org/abs/1611.09827 Our results are for simple models (2-layer, not very "deep"); we were interested in understanding the low-level "features" of music rather than building a model that maximizes performance. Nevertheless, the results are quite promising; I'm confident that someone using our dataset with a deep network and a lot of GPUs could blow our numbers out of the water! :) Tutorials on how to set up and evaluate this task are available on our website: http://homes.cs.washington.edu/~thickstn/start.html |
I am starting to record my own dataset for solo jazz piano - all MIDI though. It has monophonic melodies, plus matching chord voicings and the voice leading from one chord to the next. The goal is to learn to generate a good-sounding jazz piano arrangement for a given melody, with nothing except the monophonic melody as input.
Style transfer is essentially good at texture transfer - I suspect it won't work that well for understanding music theory (or text), especially with long-range dependencies over time, but I will be very curious to see what emerges.
I'd like to hear more generative music samples from DeepMind's WaveNet too. The piano samples they published sounded very good, but it was unclear what the model had learned or generalised - and how much was semi-randomised recall. I haven't seen the open source implementations of WaveNet produce results as good yet - probably because it's computationally very expensive to train and run, and that limits experimentation. I saw Aäron give a talk on it a couple of weeks ago, which helped me understand the stacked dilated convolutions - but I would still like to hear more music examples :)
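For anyone trying to get a feel for the stacked dilated convolutions, here is a rough sketch of how the receptive field grows with depth. It assumes a filter width of 2 and dilations doubling from 1 to 512, repeated in stacks, as described in the WaveNet paper; the function name `receptive_field` and the 16 kHz sample rate are my own assumptions for illustration, not anything taken from DeepMind's code.

```python
def receptive_field(dilations, kernel_size=2):
    """Number of input samples that can influence a single output sample."""
    # Each causal dilated layer extends the context by (kernel_size - 1) * dilation.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# One stack: dilations 1, 2, 4, ..., 512 (assumed, following the paper's description).
one_stack = [2 ** i for i in range(10)]
three_stacks = one_stack * 3

SAMPLE_RATE_HZ = 16_000  # assumed sample rate, for illustration only

for name, dilations in [("1 stack", one_stack), ("3 stacks", three_stacks)]:
    rf = receptive_field(dilations)
    print(f"{name}: {rf} samples, about {rf / SAMPLE_RATE_HZ:.3f} s of audio")
```

Under these assumptions, even 30 layers only see a fraction of a second of raw audio, which is one way to think about why long-range musical structure is hard to capture at the sample level.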