by mk_stjames 1234 days ago
This is literally trained on, and operates at the level of, the audio bitstream. So getting it to 'spit out MIDI' would be no different from the existing task of taking a full, mixed audio track and generating MIDI from it (which isn't easy). It uses a transformer architecture directly on a tokenization of .wav audio. There is no underlying 'instrumentation' stage, the way a MIDI track exists before it gets synthesized into an audio stream.
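To make the distinction concrete, here is a rough sketch of what "tokenizing audio" can look like next to a note-level representation. It assumes simple 8-bit mu-law quantization purely for illustration; the thread doesn't say how this model tokenizes audio, and it may well use a learned codec instead. The names `mu_law_tokens` and `midi_like_event` are made up for this example.

```python
import math

def mu_law_tokens(samples, levels=256):
    """Map float samples in [-1, 1] to integer token IDs in [0, levels - 1].

    Assumption: plain mu-law companding, used here only as a stand-in
    for whatever audio tokenizer the model actually uses.
    """
    mu = levels - 1
    tokens = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip to the valid range
        # Compress amplitude logarithmically, preserving the sign.
        y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
        # Shift [-1, 1] onto [0, mu] and round to the nearest token ID.
        tokens.append(int((y + 1) / 2 * mu + 0.5))
    return tokens

# 10 ms of a 440 Hz sine wave at a 16 kHz sample rate: 160 samples.
sample_rate = 16000
samples = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(160)]
tokens = mu_law_tokens(samples)

# The same sound as a single note-level event (MIDI note 69 is A4, 440 Hz).
# This dict is a hypothetical simplification, not a real MIDI message.
midi_like_event = {"note": 69, "velocity": 100, "duration_ms": 10}
```

The token stream is 160 integers that say nothing explicit about pitch, instrument, or note boundaries, while the note-level form says "A4 for 10 ms" in one record. A model that only ever sees the first kind of sequence has no intermediate representation from which MIDI could simply be read off.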

Which is a Bad Thing, because it limits its usefulness as an educational tool for anyone who wants to learn from or modify the music.