I've done this kind of stuff through a point-and-click UI in GIS software. It's really cool seeing a lot of the underlying math and concepts laid out like this.
ESRI software has had this raster function for quite a while, at least 20 years. Usually 2 or 3 points would suffice. Using hundreds of points was unnecessary.
Hundreds of points let you average out the error in each individual match. 2 or 3 requires that you've definitely clicked the same point on both images; a human can use other bits of the image to work that out, but a computer finds it harder.
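To make the averaging argument concrete, here's a rough sketch, not what ESRI or the article actually does. It assumes the two images are related by a 2D affine transform and that each matched point is off by a few pixels of Gaussian noise; `fit_affine`, `apply_affine`, `mean_fit_error` and the numbers in them are all made up for illustration. It fits the transform by least squares and measures how far the fitted transform drifts from the true one.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Design matrix in homogeneous form: each row is [x, y, 1].
    A = np.hstack([src, np.ones((len(src), 1))])
    # Solve A @ M ~= dst for the 3x2 parameter matrix M.
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def apply_affine(M, pts):
    """Apply a 3x2 affine parameter matrix to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ M

def mean_fit_error(n_points, noise_px, trials=200, seed=0):
    """Average pixel error of the fitted transform over random trials."""
    rng = np.random.default_rng(seed)
    # Hypothetical "true" transform: slight rotation/scale plus a shift.
    true_M = np.array([[0.98, 0.05], [-0.05, 0.98], [12.0, -7.0]])
    # Fixed set of points where the fitted transform is evaluated.
    grid = rng.uniform(0, 1000, size=(500, 2))
    target = apply_affine(true_M, grid)
    errs = []
    for _ in range(trials):
        src = rng.uniform(0, 1000, size=(n_points, 2))
        # Assumed noise model: each match is off by Gaussian noise in x and y.
        dst = apply_affine(true_M, src) + rng.normal(0, noise_px, size=(n_points, 2))
        M = fit_affine(src, dst)
        errs.append(np.mean(np.linalg.norm(apply_affine(M, grid) - target, axis=1)))
    return float(np.mean(errs))

if __name__ == "__main__":
    for n in (3, 10, 300):
        print(n, "points:", mean_fit_error(n, noise_px=2.0))
```

With 3 points the affine fit is exactly determined, so any error in a click goes straight into the transform (and 2 points only pin down a similarity transform: shift, rotation, uniform scale). With hundreds of noisy matches the per-point errors largely cancel, which is why automated matching leans on many points while a careful human can get away with a handful.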