Hacker News
by recursive 928 days ago
If you want to do ML on notation, then maybe. MIDI or PCM audio might be a better place to start if you want to work directly on the music.

Note that MIDI is a lot more effective when it comes to ML/AI, since it's multiple orders of magnitude less data than PCM audio. Daniel D. Johnson's (formerly known as Hexahedria, later hired by Google Brain) model biaxial-rnn-music-composition dates from 2015, requires very few resources for training or inference, and still delivers compelling, SOTA-or-close results at improvising ("noodling") classical piano: https://github.com/danieldjohnson/biaxial-rnn-music-composit... You may also want to check out user kpister's recent port to Python 3.x and aesara: https://github.com/kpister/biaxial-rnn-music-composition (Hat tip: https://news.ycombinator.com/item?id=30328593 ).
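To put a rough number on the "orders of magnitude" point, here is a back-of-the-envelope sketch. The PCM figures are standard CD-quality values; the MIDI figures (10 notes per second, about 4 bytes per event including its delta time) are illustrative assumptions, not measurements from any dataset.

```python
import math

def pcm_bytes_per_second(sample_rate=44_100, bytes_per_sample=2, channels=2):
    """Uncompressed CD-quality audio: 44.1 kHz, 16-bit, stereo."""
    return sample_rate * bytes_per_sample * channels

def midi_bytes_per_second(notes_per_second=10, events_per_note=2, bytes_per_event=4):
    """Assumed piano performance: each note is a note-on plus a note-off,
    and each event costs roughly 4 bytes (status, pitch, velocity, delta time)."""
    return notes_per_second * events_per_note * bytes_per_event

pcm = pcm_bytes_per_second()    # 176400 bytes/s
midi = midi_bytes_per_second()  # 80 bytes/s under the assumptions above
ratio = pcm / midi

print(f"PCM: {pcm} B/s, MIDI: {midi} B/s, ratio: {ratio:.0f}x "
      f"(about {math.floor(math.log10(ratio))} orders of magnitude)")
```

A denser performance or more controller data would shrink the gap, but under these assumptions it stays around three orders of magnitude.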

Music generation from notation is pretty much the MNIST-style, toy-scale equivalent for sequence/language learning models; it's surprising that so little attention is being paid to it, given how easy it is to get started with.

MIDI is absolutely horrible for ML. It lacks very necessary information such as articulation etc., which is important to make sense of music. It's popular because it's simple, but there is no way to understand music by just looking at MIDI.

I'm a hobbyist in this space (I'm a composer as well as a software engineer), and currently all the tools are very poor. MusicXML is better than MIDI, MEI [1] is better than MusicXML, etc.

The problem is that a minuscule amount of effort and money is spent on this field, because music overall makes peanuts. It really doesn't justify training expensive ML algorithms, which is unfortunate.

[1] https://music-encoding.org/about/

> MIDI is absolutely horrible for ML. It lacks very necessary information such as articulation etc., which is important to make sense of music.

This depends enormously on the instrument. Consider someone playing a piece live on a keyboard: we can keep a MIDI recording of that and we've captured everything about their performance that the audience hears.
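As a minimal sketch of what such a recording contains: a keyboard emits note-on and note-off messages carrying pitch and velocity, and pairing them recovers each note's timing and dynamics. The tuple layout and the `pair_notes` helper below are hypothetical, chosen for illustration; they are not the API of any MIDI library. The one protocol detail relied on is that a note-on with velocity 0 counts as a note-off.

```python
from dataclasses import dataclass

@dataclass
class Note:
    pitch: int     # MIDI note number (60 = middle C)
    velocity: int  # how hard the key was struck, 1-127
    start: float   # seconds
    end: float     # seconds

def pair_notes(messages):
    """Pair note-on/note-off messages into notes.

    `messages` is an iterable of (time_seconds, kind, pitch, velocity)
    tuples with kind in {"on", "off"} (a hypothetical layout).
    """
    active = {}  # pitch -> (start_time, velocity) for keys currently held
    notes = []
    for time, kind, pitch, velocity in sorted(messages, key=lambda m: m[0]):
        if kind == "on" and velocity > 0:
            active[pitch] = (time, velocity)
        elif pitch in active:
            # An explicit note-off, or a note-on with velocity 0.
            start, vel = active.pop(pitch)
            notes.append(Note(pitch, vel, start, time))
    return notes

performance = [
    (0.0, "on", 60, 80),
    (0.5, "off", 60, 0),
    (0.5, "on", 64, 100),
    (1.0, "on", 64, 0),  # velocity 0 acts as a note-off
]
notes = pair_notes(performance)
for note in notes:
    print(note)
```

Pedalling would arrive as separate control-change messages (sustain is controller 64), which this sketch leaves out.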

> MIDI is absolutely horrible for ML.

It depends what you're trying to do. If you're trying to generate sheet music that's pretty to look at and easily understandable to a performer, then yes obviously it's not enough. If you want notes that will actually sound good when played back, it's hard to beat it.

> If you want notes that will actually sound good when played back, it's hard to beat it.

I strongly disagree with this. There is no good algorithmic music generator trained on MIDI. They all generate elevator music.

Are you aware of the system I linked above? D.D. Johnson has a blog post https://www.danieldjohnson.com/2015/08/03/composing-music-wi... with plenty of examples of what an instance of his model can generate. It may not be all that "good" in an absolute sense, but it's at least musically interesting, the opposite of elevator music. (There's also a proprietary model/AI called AIVA about which very little is known, but it does seem to be bona fide AI output - albeit released in versions that have been orchestrated by humans - based on what it sounds like.)

Yes, I'm familiar, and...

> Here's a taste of things to come

it sounds like randomly generated MIDI... Doesn't sound like anything to me at all.

Music is very subjective, but so far I've seen no model that's convincing. If you like it, that's cool, I suppose. I personally use algorithmic composition plenty in my own work (I write music for piano), and these kinds of models don't do it for me. They're definitely tools; you can use them like ChatGPT to get a sense of things, but we're decades away from producing "music" this way, imho.