by xavriley 1120 days ago
There’s a model for music transcription (audio to MIDI) called MT3, which takes an end-to-end transformer approach and claims SOTA on some datasets. However, from my own research and from comparing it with other models, MT3 seems very prone to overfitting, and its real-world results are not as impressive. A similar story seems to be playing out in the comments here.
1 comment

What would you say is a good model for audio-to-MIDI transcription?