I can confirm that FishPlaysPokemon uses OpenCV. There's a wealth of tutorials and documentation for it, so if you just want to isolate certain color ranges (which is what they're doing), you'll find plenty of resources. There are also Python bindings to make things even friendlier. After that, it's some simple math to figure out which quadrant the fish is in and then trigger the correct button press.
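For what it's worth, the "simple math" part could look roughly like the sketch below. It is not their actual code. It assumes you already have a binary mask of the fish-colored pixels (something like OpenCV's `cv2.inRange` would give you one), and the `BUTTONS` layout and function names are made up for illustration. It is written in plain Python so it runs without OpenCV installed.

```python
# Hypothetical quadrant-to-button layout; the real mapping is not given here.
BUTTONS = {(0, 0): "UP", (0, 1): "RIGHT", (1, 0): "LEFT", (1, 1): "DOWN"}


def centroid(mask):
    """Return the mean (x, y) of all nonzero pixels, or None if the mask is empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def quadrant_button(mask):
    """Map the mask's centroid to one of four quadrants and look up its button."""
    center = centroid(mask)
    if center is None:
        return None  # no fish-colored pixels in this frame
    height, width = len(mask), len(mask[0])
    col = 0 if center[0] < width / 2 else 1
    row = 0 if center[1] < height / 2 else 1
    return BUTTONS[(row, col)]
```

With a real frame you would build the mask from the color threshold first and then call `quadrant_button(mask)` once per frame, probably with some debouncing so the same press isn't fired dozens of times a second.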
Actually, they just have to detect more than $x pixels of $color in each region and then do a button press based on that. In the abstract, I don't think it's all that complex, given that the fish are starkly different colors that don't occur in the tank or elsewhere in the video feed.
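As a rough illustration of the "more than $x pixels of $color per region" idea, here is a minimal sketch. The region names, color bounds, and the frame-as-nested-lists representation are all assumptions for the example; a real version would most likely run on OpenCV/NumPy arrays instead.

```python
def in_range(pixel, lo, hi):
    """True if every channel of the pixel lies within [lo, hi] inclusive."""
    return all(l <= c <= h for c, l, h in zip(pixel, lo, hi))


def count_in_region(frame, region, lo, hi):
    """Count pixels inside region (x0, y0, x1, y1) whose color is in range."""
    x0, y0, x1, y1 = region
    return sum(
        in_range(frame[y][x], lo, hi)
        for y in range(y0, y1)
        for x in range(x0, x1)
    )


def triggered_buttons(frame, regions, lo, hi, min_pixels):
    """Return the names of regions holding more than min_pixels matching pixels."""
    return [
        name
        for name, region in regions.items()
        if count_in_region(frame, region, lo, hi) > min_pixels
    ]
```

Here `regions` would be a dict such as `{"A": (0, 0, 320, 240), ...}` (hypothetical coordinates), and `min_pixels` plays the role of $x. Tuning that threshold is what keeps stray noise pixels from triggering a press.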