Hacker News
by treyd 930 days ago
MusicXML seems to be geared more toward notation and sheet music typesetting than toward algorithmic operations on the notes themselves. Sure, you could train a model on it, but you'd be better off training on a representation specific to the domain and then classically translating up to the XML format.
1 comment

Right, but sheet music is ubiquitous in countless musical contexts, and it gets very little attention from the ML side. Sheet music is somewhat arduous to create, so there is definitely room for automation, and ML could help a lot. I experimented with a tokenizer / GPT-2 (decoder-only) model for MusicXML (https://github.com/jsphweid/xamil) that is able to generate single-staff music somewhat coherently. That's just a first step, though, and I don't care about generating (EDIT: hallucinated) music. Ideally we could add an encoder to a model like this that takes in MIDI tokens and spits out sheet music. I haven't gotten that far and don't have the ML chops to do it at this time, but it shouldn't be impossible.
Going straight from MP3 to sheet music would be even better, but it's probably 10x harder to do well.
For now, between the state-of-the-art source separation models (e.g. Demucs) and transcription models (e.g. Magenta's MT3), the last mile seems to be MIDI -> MusicXML, IMO. But yes, I suspect it'll become more end-to-end ML in time.
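
To make the MIDI -> MusicXML step concrete, here's a rough sketch of the purely rule-based version, using only the Python standard library. It is not what xamil or MT3 does. The function names, the sharps-only pitch spelling, the fixed 4/4 treble-clef setup, and the input format (a monophonic list of `(midi_number, length_in_quarter_notes)` pairs with no rests) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: always spell black keys as sharps. Real engraving would pick
# C# vs. Db from the key and harmonic context, which is part of the hard bit.
STEP_ALTER = [("C", 0), ("C", 1), ("D", 0), ("D", 1), ("E", 0), ("F", 0),
              ("F", 1), ("G", 0), ("G", 1), ("A", 0), ("A", 1), ("B", 0)]
DIVISIONS = 4  # MusicXML duration units per quarter note (16th-note grid)
TYPES = {1: "16th", 2: "eighth", 4: "quarter", 8: "half", 16: "whole"}


def midi_to_pitch(midi_number):
    """Map a MIDI note number to (step, alter, octave); MIDI 60 is C4."""
    step, alter = STEP_ALTER[midi_number % 12]
    return step, alter, midi_number // 12 - 1


def notes_to_musicxml(notes, beats_per_measure=4):
    """Build a one-part MusicXML string from (midi_number, quarters) pairs."""
    root = ET.Element("score-partwise", version="4.0")
    part_list = ET.SubElement(root, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Music"
    part = ET.SubElement(root, "part", id="P1")

    capacity = beats_per_measure * DIVISIONS
    filled = capacity  # forces a new measure before the first note
    measure, number = None, 0
    for midi_number, quarters in notes:
        duration = round(quarters * DIVISIONS)
        if duration not in TYPES:
            raise ValueError("dotted notes and tuplets are not handled")
        if filled >= capacity:
            number += 1
            measure = ET.SubElement(part, "measure", number=str(number))
            filled = 0
            if number == 1:
                # Assumed defaults: C major, quarter-note beat, treble clef.
                attrs = ET.SubElement(measure, "attributes")
                ET.SubElement(attrs, "divisions").text = str(DIVISIONS)
                key = ET.SubElement(attrs, "key")
                ET.SubElement(key, "fifths").text = "0"
                time = ET.SubElement(attrs, "time")
                ET.SubElement(time, "beats").text = str(beats_per_measure)
                ET.SubElement(time, "beat-type").text = "4"
                clef = ET.SubElement(attrs, "clef")
                ET.SubElement(clef, "sign").text = "G"
                ET.SubElement(clef, "line").text = "2"
        if filled + duration > capacity:
            raise ValueError("note crosses a barline; ties are not handled")

        step, alter, octave = midi_to_pitch(midi_number)
        note = ET.SubElement(measure, "note")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        if alter:
            ET.SubElement(pitch, "alter").text = str(alter)
        ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(duration)
        ET.SubElement(note, "type").text = TYPES[duration]
        filled += duration
    return ET.tostring(root, encoding="unicode")


# Two bars of 4/4: C4, D4 (quarters), E4 (half), then F4 (whole).
print(notes_to_musicxml([(60, 1), (62, 1), (64, 2), (65, 4)]))
```

Everything this sketch refuses to handle (enharmonic spelling, rests, ties across barlines, tuplets, quantizing messy performance timing, voices and multiple staves) is where the real work is, and where a learned model could plausibly beat hand-written rules.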