|
|
|
|
|
by xavriley
1120 days ago
|
|
There’s a model for music transcription (audio to midi) called MT3 which takes an end-to-end transformer approach and claims SOTA on some datasets. However, from my own research and comparing with other models it seems that MT3 is very prone to overfitting and the real world results are not as impressive. A similar story seems to be playing out in the comments here |
|