Yes, although not for very low-resolution images (8x8 or 16x16), because then you run out of frequency gap between the frequencies you are interested in and the ones the windowing function contributes to. EDIT: Another facet of the same issue, viewed from pixel space, is that windowing throws away much of the contribution of pixels close to the edge, and in such small images the majority of pixels are close to the edge.
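To make the windowing step concrete, here is a minimal NumPy sketch. It assumes a separable 2D Hann window built from `np.hanning`; the choice of window and the helper names `hann2d` and `windowed_spectrum` are mine for illustration, not something the question prescribes.

```python
import numpy as np

def hann2d(height, width):
    """Separable 2D Hann window: outer product of two 1D Hann windows."""
    return np.outer(np.hanning(height), np.hanning(width))

def windowed_spectrum(image):
    """Magnitude of the 2D DFT of a grayscale image after Hann windowing."""
    image = np.asarray(image, dtype=float)
    # Taper the image towards zero at the borders to reduce the
    # discontinuity that the implicit periodic extension would create.
    window = hann2d(*image.shape)
    return np.abs(np.fft.fft2(image * window))

# With np.hanning the end points of the window are zero, so in an 8x8
# image the whole outer ring of pixels is multiplied by zero.
small_window = hann2d(8, 8)
zero_weight_pixels = np.count_nonzero(np.isclose(small_window, 0.0))
```

For an 8x8 image the outer ring alone is 28 of the 64 pixels, which is the pixel-space side of the problem described above.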
There's also an interesting alternative for images, if you don't need to invert the transform to get the original image back: you can express an image as the sum of a periodic component and a very slowly varying one. The periodic component looks like the original to the human eye, so taking its DFT is usually sufficient for analysis. The paper that introduced the technique: https://hal.archives-ouvertes.fr/hal-00388020v1/document
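A rough sketch of that periodic plus smooth decomposition in NumPy, as I understand it from the linked paper: build an image that is nonzero only on the border and encodes the wrap-around jumps, then solve a discrete Poisson equation in the Fourier domain to get the smooth component. The function name `periodic_smooth_decomposition` is my own, and the code assumes a single-channel image with at least two rows and two columns, so treat it as an illustration rather than a reference implementation.

```python
import numpy as np

def periodic_smooth_decomposition(u):
    """Split a 2D image u into (p, s) with u = p + s.

    p is the periodic component, s the slowly varying (smooth) one.
    """
    u = np.asarray(u, dtype=float)
    rows, cols = u.shape

    # Border image: jumps between opposite edges of u, zero elsewhere.
    v = np.zeros_like(u)
    v[0, :] = u[-1, :] - u[0, :]
    v[-1, :] = u[0, :] - u[-1, :]
    v[:, 0] += u[:, -1] - u[:, 0]
    v[:, -1] += u[:, 0] - u[:, -1]

    # Solve the periodic discrete Poisson equation (Laplacian of s = v)
    # by dividing by the Laplacian's eigenvalues in the Fourier domain.
    q = np.arange(rows).reshape(rows, 1)
    r = np.arange(cols).reshape(1, cols)
    denom = 2.0 * np.cos(2.0 * np.pi * q / rows) \
          + 2.0 * np.cos(2.0 * np.pi * r / cols) - 4.0
    denom[0, 0] = 1.0            # placeholder to avoid dividing by zero
    s_hat = np.fft.fft2(v) / denom
    s_hat[0, 0] = 0.0            # smooth component is defined with zero mean

    s = np.real(np.fft.ifft2(s_hat))
    p = u - s
    return p, s

# Example: a horizontal ramp has a large jump between its left and right
# edges; the periodic component p is what you would feed to the DFT.
ramp = np.tile(np.arange(32.0), (32, 1))
p, s = periodic_smooth_decomposition(ramp)
spectrum = np.abs(np.fft.fft2(p))
```

Unlike windowing, this keeps every pixel's contribution, including those near the border, which is why it can be a better fit for small images.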