Hacker News
by redka 1643 days ago
For anyone interested, I've transcribed this song [1] through the Replicate link the author provided (Colab throws errors for me), using mode music-piano-v2. It spits out MP3s there instead of MIDIs, so you can hear how it did [2]

[1] https://www.youtube.com/watch?v=h-eEZGun2PM [2] https://replicate.com/p/qr4lfzsqafc3rbprwmvg2cw5ve

3 comments

Awesome, thanks for running the test!

I can't help but feel it is heavily impacted by the ambience of the recording as well. The MIDI is of course a very rigid and literal interpretation of what the model hears as pitches over time, but it lacks the subtlety of realizing that a pitch is sustaining because of an ambient effect, or that the attack actually starts a little bit before the beginning of the pitch, etc.

If it could be enhanced to consider such things, I bet you would get much cleaner, more machine-like midis, which are generally preferable.
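To make "cleaner, more machine-like" concrete, here is a rough sketch of one possible post-processing pass over transcribed notes. It is not part of the model or its tooling: the `Note` type, the `clean_notes` function, and all thresholds are hypothetical and chosen only for illustration. It snaps onsets to a time grid, drops very short detections, and caps notes that ring on too long (for example because of reverb).

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int       # MIDI note number (60 = middle C)
    onset: float     # start time in seconds
    duration: float  # length in seconds


def clean_notes(notes, grid=0.125, max_duration=2.0, min_duration=0.05):
    """Snap onsets to a grid, drop very short notes, cap very long ones.

    All parameters are illustrative assumptions, not values taken
    from the transcription model.
    """
    cleaned = []
    for n in notes:
        if n.duration < min_duration:
            continue  # likely a spurious detection
        onset = round(n.onset / grid) * grid      # quantize the start time
        duration = min(n.duration, max_duration)  # trim ambience-driven sustain
        cleaned.append(Note(n.pitch, onset, duration))
    # Keep the output ordered by time, then pitch
    return sorted(cleaned, key=lambda n: (n.onset, n.pitch))


raw = [Note(60, 0.49, 3.5), Note(64, 1.02, 0.01), Note(67, 0.26, 0.4)]
result = clean_notes(raw)
```

A fixed grid and a hard duration cap are crude. Telling real sustain apart from room ambience would need information from the audio itself, which this sketch does not use.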

Listening to the input and output, the reproduction is like comparing a cat to a picture that resembles a cat, but isn't a cat.
Music's a lot more than a collection of notes ... and the timbre of one piano is about as far away from a mixture of reeds and dulcet electronics as you can get. (The Fulero is very nice.)
How do you get a MIDI?
You'd have to run it yourself. There's a Docker image available, but it's a pretty big download (11.7 GB).