I'm sorry, but I don't get it. If what you want is an accurate color reconstruction for scientific purposes, then yes, you probably need depth information. But if you simply want to colorise an image by inferring the correct colors from a mix of residual color, blue shift, and general knowledge about the objects themselves (the color of seaweed, a starfish, etc.), then you can probably do it pretty well from a single image. There are neural networks that can colorise a b&w picture with zero color data available(1), and in this case you just want to enhance existing colors, so it should be much easier. Of course you'll get a representation with no claim of accuracy, but it should be fine for many purposes.
(1) often with pretty bad results, but consider the variety of scenes and objects above water compared to the average underwater scenario.
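To make the "enhance existing colors from a single image" point concrete, here is a minimal sketch of a much cruder approach than a neural network: a classic gray-world white balance. It assumes the scene should average out to neutral gray and rescales each channel accordingly, with no depth data at all. The function name and the flat list-of-tuples pixel format are my own choices for illustration, and this is not what any particular underwater tool does.

```python
def gray_world_balance(pixels):
    """Rescale each channel so its mean matches the overall mean.

    pixels: list of (r, g, b) tuples with values in 0..255 (assumed format).
    Relies on the gray-world assumption: the average scene color is neutral.
    """
    n = len(pixels)
    if n == 0:
        return []
    # Per-channel means; underwater shots typically have a weak red mean.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3
    # One gain per channel; leave an all-zero channel untouched.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    # Apply the gains and clip to the valid 8-bit range.
    return [
        tuple(min(255, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]


# Two bluish pixels: red gets boosted, blue gets pulled down.
print(gray_world_balance([(20, 100, 180), (40, 120, 200)]))
```

Like the neural approach, this makes no claim of accuracy: it only pushes the existing color data toward something that looks plausible.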