| I'm so excited and curious about this that I can't even structure my thoughts well. They're just flooding my brain. What part of this is pre-coded? What part is being generated? Is the goal to give a program some sheet music (maybe a MIDI file) and it figures out the fingerings[1] and then translates the fingerings into kinematics? Because if that's the goal... Holy forking shirtballs that would be amazing. One of the trickiest things for me as a novice pianist is figuring out the fingerings to a piece. It's like a puzzle you work at until you've figured out what's comfortable. It's all about lookahead. "This section generally goes down so I probably want to begin with my pinky and not my thumb." And if it got really good at that, not only are the fingerings useful, but maybe we could get feedback on how physically demanding a piece is. Another challenge I've discovered as a novice is that it can be surprisingly tricky to look at sheet music or hear a piece and determine if it's as easy as it sounds. Some pieces require some very complex fingering. [1] what pianists call the determination of what fingers go where, not just to play certain notes together, but to ensure you can fluidly and comfortably play the next notes as well. |
The demo you are watching is an agent trained from scratch with reinforcement learning. It has roughly 6 days of experience (10M steps at 20 Hz). The JavaScript demo replays the policy open-loop, which is why it's not super robust to disturbances.
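To make the two claims above concrete, here is a toy sketch (not the RoboPianist code; the function names, the 1-D "system" and the gain are all made up for illustration). It converts steps at a fixed control rate into days of simulated experience, and it shows why replaying recorded actions open-loop cannot recover from a push while a closed-loop policy can.

```python
def experience_days(steps: int, control_hz: float) -> float:
    """Convert environment steps at a fixed control rate into days of simulated time."""
    seconds = steps / control_hz
    return seconds / 86_400  # seconds per day


def rollout(actions=None, push=0.0, disturbance_step=5, n_steps=30, gain=0.5):
    """Drive a 1-D position x toward 0 (hypothetical toy system).

    If `actions` is None, act closed-loop: each action is computed from the
    current x. Otherwise replay the given actions open-loop, ignoring x.
    Returns (final position, list of actions taken).
    """
    x, taken = 1.0, []
    for t in range(n_steps):
        if t == disturbance_step:
            x += push  # external disturbance, e.g. someone nudging the hand
        a = -gain * x if actions is None else actions[t]
        taken.append(a)
        x += a
    return x, taken


if __name__ == "__main__":
    print(f"{experience_days(10_000_000, 20):.2f} days of experience")

    _, recorded = rollout()                        # undisturbed closed-loop run
    open_x, _ = rollout(actions=recorded, push=1.0)  # replay, then get pushed
    closed_x, _ = rollout(push=1.0)                # closed-loop, same push
    print(f"open-loop final error:   {abs(open_x):.4f}")
    print(f"closed-loop final error: {abs(closed_x):.4f}")
```

10M steps at 20 Hz works out to about 5.8 days, which is the "roughly 6 days" figure. In the toy rollout, the open-loop replay ends up offset by the full size of the push, because the recorded actions never see the new state, while the closed-loop run corrects it.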
Re: fingering: we actually use fingering information to create a dense reward for the agent (without it, exploration is super hard). It would be an exciting future direction to have the agent discover and optimize for the fingering that best suits its kinematics :) And beyond that, having RL inform pianists about the difficulty of a piece, or even suggest better fingerings, would be amazing.
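As a rough illustration of what "dense reward from fingering" can mean, here is a minimal sketch. It is an assumption-laden toy, not the reward from the paper: the function name, the exponential shaping and the `length_scale` value are all hypothetical. The idea is just that each annotated (finger, key) pair earns reward that grows smoothly as the fingertip approaches its assigned key, so the agent gets a learning signal long before it presses the right note.

```python
import math


def dense_fingering_reward(fingertips, keys, assignment, length_scale=0.05):
    """Hypothetical dense reward in [0, 1] built from fingering annotations.

    fingertips: dict finger name -> (x, y, z) fingertip position in meters
    keys:       dict key name    -> (x, y, z) key position in meters
    assignment: dict finger name -> key name it should play at this timestep
    """
    if not assignment:
        return 0.0  # no notes to play at this timestep
    total = 0.0
    for finger, key in assignment.items():
        distance = math.dist(fingertips[finger], keys[key])
        # 1.0 when the fingertip is on the key, decaying smoothly with distance.
        total += math.exp(-distance / length_scale)
    return total / len(assignment)


if __name__ == "__main__":
    keys = {"C4": (0.0, 0.0, 0.0)}
    assignment = {"thumb": "C4"}
    for x in (0.10, 0.01, 0.0):
        reward = dense_fingering_reward({"thumb": (x, 0.0, 0.0)}, keys, assignment)
        print(f"thumb {x:.2f} m from C4 -> reward {reward:.3f}")
```

Compare this with a sparse reward that only pays out when the correct key is actually pressed: a randomly flailing hand almost never hits it, so there is nothing to climb. The shaped version gives a gradient toward the right key from anywhere nearby.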
We trained a bunch of these policies on roughly 150 songs (baroque, romantic, classical) and we did some analysis in the paper if you're interested: https://kzakka.com/robopianist/robopianist.pdf