| No, Markov chains at least cannot work, because they are fundamentally finite state machines with no global state. Say you want to generate 'verse chorus verse slightly-different-chorus', which is an idea that I've seen in basically every type of music that I've listened to. If you want to generate a slightly different version of the first chorus, you need knowledge of the first chorus, which is not possible with a Markov chain unless the state that represents the start of the second chorus can only be reached when the first chorus has been generated; i.e. you need to code every possible second chorus into your Markov chain, i.e. you need to code every possible first chorus into your Markov chain, i.e. the human has composed the piece. The thing with computer-generated music is that music is complicated; it's fundamentally not just a set of rules that you can apply to get good music. Yes, counterpoint does have many rules and suggestions that can restrict you, but they don't specify all good music. In the same way that combining logical axioms and inference rules at random generally just gives you random useless theorems, combining musical rules in an unstructured way is pretty much guaranteed to get you useless sequences of locally-alright notes. The correct way of using the rules is (with logic) to start at the conjecture you want to prove and use the computer to prove it by working backwards. With counterpoint, it's to compose the music, click the 'check for mistakes' button in Sibelius and check that you haven't made any glaring errors. |
If you use conditional random fields (CRFs), you can condition on the whole piece and learn the model that way. Yes, you'll need a lot of data, but the model can be as global as you need it to be.
If you want to compose in a 'verse chorus verse slightly-different-chorus' way, yes, you can: use a first-level chain to generate the probable sequence of musical blocks, then generate each block separately, sharing features across the parts (verse, chorus, slightly-different-chorus, etc.) so that they keep the same feeling.
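To make the two-level idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the transition tables `SECTION_CHAIN` and `NOTE_CHAIN` are made-up toy values rather than learned ones, and the `memory` dictionary is just one simple stand-in for the "features shared between parts". A repeated section is produced by copying its first occurrence and resampling a single note.

```python
import random

# Hypothetical toy transition tables; a real model would learn these from data.
SECTION_CHAIN = {
    "start": {"verse": 1.0},
    "verse": {"chorus": 1.0},
    "chorus": {"verse": 0.6, "end": 0.4},
}
NOTE_CHAIN = {
    "C": {"D": 0.5, "E": 0.5},
    "D": {"C": 0.3, "E": 0.7},
    "E": {"C": 0.6, "G": 0.4},
    "G": {"C": 0.5, "E": 0.5},
}


def sample(dist, rng):
    """Draw one key from a {key: probability} table."""
    options = list(dist)
    weights = [dist[o] for o in options]
    return rng.choices(options, weights=weights, k=1)[0]


def generate_form(rng, max_sections=6):
    """First level: a chain over section labels (verse, chorus, ...)."""
    form, state = [], "start"
    while len(form) < max_sections:
        state = sample(SECTION_CHAIN[state], rng)
        if state == "end":
            break
        form.append(state)
    return form


def generate_block(rng, length=8, first="C"):
    """Second level: a chain over notes inside one section."""
    notes = [first]
    while len(notes) < length:
        notes.append(sample(NOTE_CHAIN[notes[-1]], rng))
    return notes


def vary(block, rng):
    """Copy a block and resample one note (it may come out unchanged)."""
    varied = list(block)
    i = rng.randrange(1, len(varied))
    varied[i] = sample(NOTE_CHAIN[varied[i - 1]], rng)
    return varied


def generate_song(seed=0):
    rng = random.Random(seed)
    memory = {}  # shared state: first occurrence of each section type
    song = []
    for section in generate_form(rng):
        if section in memory:
            block = vary(memory[section], rng)
        else:
            block = generate_block(rng)
            memory[section] = block
        song.append((section, block))
    return song


if __name__ == "__main__":
    for name, notes in generate_song():
        print(name, " ".join(notes))
```

Note that `memory` lives outside both chains, so strictly speaking this is a Markov chain plus a small amount of global state, not a plain Markov chain.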
If you train your model as described above, you can then pick a tune in your head, feed it in and ask the model for the most probable sequence for the whole song. Or, if you're using CRFs with Gibbs sampling, you start from a complete piece and iterate until its probability under the model is large enough. The same could be done, somewhat easily, with hidden Markov models (HMMs). (I just realised that 'Markov models' might not be the right name for what I was referring to in the post above; I was talking about the statistical variant of a Markov chain.)
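A rough sketch of the Gibbs sampling loop, in Python, might look like the following. This is not a trained CRF: `pair_score` is a hand-written, hypothetical log-potential between neighbouring notes, standing in for whatever a real model would learn, and the `fixed` positions play the role of the tune you put in. Each sweep resamples every free note from its conditional distribution given its neighbours, and the best-scoring piece seen so far is kept.

```python
import math
import random

NOTES = ["C", "D", "E", "G"]


def pair_score(a, b):
    """Hypothetical log-potential: reward steps, penalise repeats and leaps."""
    step = abs(NOTES.index(a) - NOTES.index(b))
    return {0: -1.0, 1: 1.0}.get(step, -0.5 * step)


def sequence_score(seq):
    """Unnormalised log-probability of the whole piece under the toy model."""
    return sum(pair_score(a, b) for a, b in zip(seq, seq[1:]))


def gibbs_sweep(seq, rng, fixed=frozenset()):
    """Resample each free position given its current neighbours (in place)."""
    for i in range(len(seq)):
        if i in fixed:
            continue  # notes supplied by the user stay as they are
        weights = []
        for cand in NOTES:
            s = 0.0
            if i > 0:
                s += pair_score(seq[i - 1], cand)
            if i < len(seq) - 1:
                s += pair_score(cand, seq[i + 1])
            weights.append(math.exp(s))
        seq[i] = rng.choices(NOTES, weights=weights, k=1)[0]
    return seq


def refine(seq, sweeps=50, seed=0, fixed=frozenset()):
    """Start from a complete piece and keep the best-scoring sample seen."""
    rng = random.Random(seed)
    current = list(seq)
    best, best_score = list(current), sequence_score(current)
    for _ in range(sweeps):
        gibbs_sweep(current, rng, fixed)
        score = sequence_score(current)
        if score > best_score:
            best, best_score = list(current), score
    return best, best_score


if __name__ == "__main__":
    start = ["C", "G", "C", "G", "C", "G"]
    piece, score = refine(start, fixed={0})
    print(" ".join(piece), score)
```

Stopping after a fixed number of sweeps is a simplification; you could equally stop once the score passes a threshold, which is closer to "iterate until the probability is large enough".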
Convolutional neural networks could do the same thing, probably even better than CRFs and HMMs. Music isn't more complex than language, and people have been using these sequence modelling methods to do extraordinary things in natural language processing.