I believe that Daniel Huttenlocher and Pedro Felzenszwalb should be credited for the multi-pass (first X, then Y) transform based on quadratic distance:
Here are a few that predate it and, I think, make the same observation:
https://dl.acm.org/doi/10.1016/j.ipl.2006.12.005
https://www.sciencedirect.com/science/article/abs/pii/002001...
That second paper, from 1996, references an even older paper from 1994, saying “Dividing rows and columns alternately, Chen and Chuang reduced the time complexity to O(N^2) which is optimal.”
https://www.sciencedirect.com/science/article/abs/pii/002001...
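
For anyone following along, here is a rough sketch of what the multi-pass (first X, then Y) idea looks like for squared Euclidean distance. It uses the lower-envelope-of-parabolas formulation for the 1D pass. This is my own illustration, not code from any of the linked papers, and the names `dt1d` and `dt2d` are made up for the example.

```python
INF = float("inf")

def dt1d(f):
    """1D pass: d[p] = min over q of (p - q)^2 + f[q]."""
    n = len(f)
    d = [INF] * n
    # Only finite samples contribute a parabola to the lower envelope.
    sites = [q for q in range(n) if f[q] < INF]
    if not sites:
        return d
    v = [sites[0]]  # positions of the parabolas forming the envelope
    z = [-INF]      # z[k] is where parabola v[k] starts being the lowest
    for q in sites[1:]:
        while True:
            p = v[-1]
            # Intersection of the parabola rooted at q with the one at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s > z[-1]:
                break
            # The parabola at p is hidden by the new one; drop it.
            v.pop()
            z.pop()
        v.append(q)
        z.append(s)
    k = 0
    for p in range(n):
        # Advance to the envelope segment that covers position p.
        while k + 1 < len(v) and z[k + 1] < p:
            k += 1
        d[p] = (p - v[k]) ** 2 + f[v[k]]
    return d

def dt2d(grid):
    """Squared Euclidean distance to the nearest truthy cell of grid."""
    rows = [[0.0 if cell else INF for cell in row] for row in grid]
    # Pass 1: along X, one row at a time.
    rows = [dt1d(row) for row in rows]
    # Pass 2: along Y, one column at a time, reusing the pass-1 results.
    cols = [dt1d(list(col)) for col in zip(*rows)]
    # Transpose back to row-major order.
    return [list(row) for row in zip(*cols)]
```

Each 1D pass is linear in the length of the row or column, so an N x N image costs O(N^2) overall, which matches the bound quoted above.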