by mk_stjames
1234 days ago
This is literally trained on, and operates at the level of, the raw audio bitstream. So getting it to 'spit out MIDI' would be no different from the existing task of taking a full, mixed audio track and generating MIDI from it (which isn't easy). It applies a transformer architecture directly to tokenized .wav audio. There is no underlying 'instrumentation' stage, the way a MIDI track exists before it gets synthesized into an audio stream.
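To make the "tokenization of .wav audio" point concrete, here is a rough sketch of one way raw samples can be turned into discrete tokens. It uses 8-bit mu-law quantization purely as an illustration; I don't know what tokenizer this model actually uses, and the `mulaw_tokenize` helper is a made-up name. The thing to notice is that the output is just a stream of amplitude codes, with nothing in it that says "note", "pitch", or "instrument".

```python
import math

MU = 255  # 8-bit mu-law companding gives a 256-entry token vocabulary (an assumption, not the model's real scheme)


def mulaw_tokenize(samples):
    """Map float samples in [-1.0, 1.0] to integer token ids in [0, 255]."""
    tokens = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip out-of-range samples
        # Compress amplitude logarithmically, keeping the sign.
        y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
        # Shift from [-1, 1] to [0, 255] and round to the nearest integer id.
        tokens.append(int(round((y + 1.0) / 2.0 * MU)))
    return tokens


# A few samples of a 440 Hz sine at 16 kHz stand in for decoded .wav data.
sample_rate = 16000
wave = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(8)]
tokens = mulaw_tokenize(wave)
print(tokens)
```

A MIDI file for the same sound would instead hold an event like "note on, A4, velocity 100". Recovering that from the token stream above is the transcription problem, not something the model has lying around internally.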