by minihat
433 days ago
How is it possible that text-to-score/notation is lagging behind text-to-audio in music generation? Generating audio seems wildly more complicated! Since you are working in this space, I wonder if you could comment on my pet theories for why this is true: 1. Not enough training data (scores are not available for most songs), or 2. Musical notation is harder to tokenize than audio.