HN Mirror


	by minihat 433 days ago
	How is it possible that text-to-score/notation is lagging behind text-to-audio in music generation? Generating audio seems wildly more complicated! Since you are working in this space, I wonder if you could comment on my pet theories for why this is true: 1. Not enough training data (scores are not available for most songs), or 2. Difficulty with tokenizing musical notation compared with audio.

2 comments

Mostly 1, I think. There are a few open-source efforts doing what you mentioned, e.g. https://github.com/EleutherAI/aria

3. Smaller market, so fewer people are trying to solve it.