The audio recognition bit is indeed basic - all you need to figure out is the note and possibly the length of the note.
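To make the "figure out the note" step concrete, here is a minimal sketch that maps a detected fundamental frequency to a note name. It assumes you already have a pitch estimate in Hz from whatever detector you use, and that you're tuning to equal temperament with A4 = 440 Hz; the function names are just illustrative.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def frequency_to_midi(freq_hz, a4_hz=440.0):
    """Round a frequency to the nearest MIDI note number (A4 = 69)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def midi_to_name(midi_note):
    """Turn a MIDI note number into a name such as 'C4'."""
    octave = midi_note // 12 - 1  # MIDI 60 is C4 by the common convention
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"

if __name__ == "__main__":
    for freq in (440.0, 261.63, 329.63):
        print(freq, midi_to_name(frequency_to_midi(freq)))
```

Note length could be handled in a similar spirit by timing how long the same MIDI number stays detected and snapping that to the nearest beat fraction.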
The following part is a bit harder, but not impossible - compare what you have recognized with what you were expecting. If it's not what you expected, look around the last known position and see if you can find the pattern you recognized (while taking repeat signs etc. into consideration).
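A rough sketch of that "compare, then look around" idea might look like this. It assumes the score is a plain list of MIDI note numbers and that repeats are handled by unrolling them up front; `unroll`, `follow` and the `window` parameter are hypothetical names for illustration, not anything from an existing library.

```python
def unroll(sections):
    """Expand (notes, times_played) sections into one flat note list."""
    out = []
    for notes, times in sections:
        out.extend(notes * times)
    return out

def follow(score, heard, expected_pos, window=8):
    """Return the score index just after the heard pattern, or None.

    First checks the expected position; if that fails, searches within
    `window` notes either side and picks the match closest to it.
    """
    n = len(heard)
    # Fast path: the player is exactly where we expected.
    if score[expected_pos:expected_pos + n] == heard:
        return expected_pos + n
    lo = max(0, expected_pos - window)
    hi = min(len(score) - n, expected_pos + window)
    candidates = [i for i in range(lo, hi + 1) if score[i:i + n] == heard]
    if not candidates:
        return None  # lost: keep listening or widen the window
    best = min(candidates, key=lambda i: abs(i - expected_pos))
    return best + n

if __name__ == "__main__":
    score = unroll([([60, 62, 64], 2), ([65, 67, 69, 71], 1)])
    print(follow(score, [65, 67], expected_pos=3, window=4))
```

Exact matching is the simplest thing that could work; in practice you would probably want to tolerate a wrong or missed note, for example by allowing a small edit distance instead of strict equality.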