by bjourne
2179 days ago
Great article. Well worth the read! I enjoyed the part about sampling, which is a big unsolved problem. To me, techniques like nucleus sampling and temperature sampling feel like hacks to make up for the fact that maximizing likelihood maybe isn't the goal!? Maybe repetitive gibberish has a higher likelihood than prose written by humans? That best-of sampling decreased text quality indicates that it does. Researchers have assumed that the problem would go away with ever-growing models. But maybe it won't?

I don't agree that generating (symbolic) music would be less sensitive to sampling issues. On the contrary, in my opinion. In text you can often get away with grammatical errors or missing punctuation. But if the pitch or timing of one chord is wrong, it's over. The audience instantly hears that it is garbage. Thus, you have to lower the temperature (or probability threshold or what have you) to make the sampling more conservative, which exacerbates the problem with repeated sequences. Of course, in music you want repetition. But not too much. The magic number (in Western music) is 4. Fewer repeats make it feel as if the music jumps around. More repeats make it feel as if the music is stuck or "looping."
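For anyone who hasn't seen the two knobs side by side, here is a rough sketch of temperature and nucleus (top-p) sampling over a toy list of logits. It uses only the Python standard library; the function names and the example logits are made up for illustration and are not taken from the article or from any particular library.

```python
import math
import random


def temperature_probs(logits, temperature=1.0):
    """Softmax over logits divided by the temperature.

    A temperature below 1 sharpens the distribution (more conservative);
    a temperature above 1 flattens it.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose total
    probability reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample(logits, temperature=1.0, top_p=1.0, rng=random):
    """Draw one token index after applying temperature, then top-p."""
    probs = nucleus_filter(temperature_probs(logits, temperature), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for a four-token vocabulary.
example_logits = [2.0, 1.0, 0.0, -1.0]
```

Lowering either `temperature` or `top_p` concentrates the probability mass on the few most likely tokens, which is exactly the conservative setting that tends to make repeated sequences worse.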