by ericye16 478 days ago
This was the basis of a project I did for my deep reinforcement learning class!

https://ericye16.com/stanford-cs224r

We were able to make some improvements by tuning how the reward is distributed and also by first pretraining the agent on scales before fine-tuning it on the final pieces.

Thanks to Kevin Zakka for helping us get started with the RL environment!

1 comments

Did you guys ever try having the agents play the song slower at first?
We definitely tried extending the lookahead, but I don't think we tried having a curriculum-style thing where we gradually increased the speed of the song. Great idea though!
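
For anyone curious what a speed curriculum like that might look like, here is a minimal sketch. It is not something from the project: the function name `tempo_schedule`, the 0.5x starting speed, and the 80% ramp fraction are all made-up assumptions for illustration, and how the multiplier is applied to the song depends on the environment.

```python
def tempo_schedule(step, total_steps, start_scale=0.5, end_scale=1.0, ramp_frac=0.8):
    """Hypothetical tempo curriculum: return a playback-speed multiplier.

    The multiplier ramps linearly from start_scale to end_scale over the
    first ramp_frac of training, then stays at end_scale for the rest.
    All default values are illustrative assumptions, not tuned settings.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Number of training steps spent ramping up the tempo (at least 1).
    ramp_steps = max(1, int(total_steps * ramp_frac))
    # Fraction of the ramp completed, clamped to [0, 1].
    progress = min(max(step, 0) / ramp_steps, 1.0)
    return start_scale + (end_scale - start_scale) * progress


if __name__ == "__main__":
    total = 1000
    for step in (0, 400, 800, 1000):
        print(step, tempo_schedule(step, total))
```

The idea would be to query the schedule at the start of each episode and slow the target song by that factor, so the agent sees the full-speed piece only late in training.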