Figure 12 is wild. They take an image from directly above a scene: a box containing a mirror (edge-on to the camera, so you can't see its surface) and some other objects. They illuminate the scene from the side (shining into the mirror) and are able to reconstruct the view from the light's position. That is, they get an image facing the mirror, whose surface the camera can't see, and it correctly shows the reflections.
Did you see Figure 16?? The camera is pointed at the back of a playing card (like, from a 52-card deck), and the front is faintly reflecting light onto a book, and they're able to reconstruct a shockingly clear image of the front of the card (King of Hearts, spoilers).
I haven't fully read through the paper, but it looks like an important part of the technique is that the light acts like a point light at each point in time. For example, the projector will emit certain patterns of light, like in Figure 6 of the paper, to illuminate a localized group of pixels in the camera. Varying this pattern over time, it's possible to reconstruct how a small amount of light exiting the light source affects a small amount of light captured by the camera.
And even with "point" light sources and the parallelized light patterns, the whole scan takes a while:
> This image is 578x680 pixels and was acquired in a little over 2 hours.
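To make the point-light idea concrete, here's a toy sketch (my own, not the paper's code). It assumes the scene is linear, so the camera image is `c = T p` for some matrix `T` that maps projector pixels to camera pixels. The names `measure_transport`, `primal_image`, `dual_image` and the random `T_TRUE` scene are all made up for illustration. It lights one projector pixel at a time to fill in `T` column by column, then uses the transpose to get the projector's-eye view. As I understand it, the real system lights many pixels in parallel instead of one at a time; this sketch skips that.

```python
import random

def measure_transport(capture, n_proj):
    """Estimate T one column at a time by lighting a single projector pixel."""
    columns = []
    for j in range(n_proj):
        pattern = [0.0] * n_proj
        pattern[j] = 1.0  # one "point light" pattern
        columns.append(capture(pattern))
    # columns[j][i] is camera pixel i lit by projector pixel j; transpose into rows.
    return [list(row) for row in zip(*columns)]

def primal_image(T, projector_pattern):
    """What the camera sees: c = T p."""
    return [sum(t * p for t, p in zip(row, projector_pattern)) for row in T]

def dual_image(T, virtual_light):
    """View from the projector, lit from the camera: p' = T^T c'."""
    return [sum(t * c for t, c in zip(col, virtual_light)) for col in zip(*T)]

# Hypothetical stand-in for a real scene: a random 6-camera-pixel by
# 4-projector-pixel transport matrix (assumption, not data from the paper).
random.seed(0)
N_CAM, N_PROJ = 6, 4
T_TRUE = [[random.random() for _ in range(N_PROJ)] for _ in range(N_CAM)]

def capture(pattern):
    """Simulated camera exposure for a given projector pattern."""
    return primal_image(T_TRUE, pattern)

T = measure_transport(capture, N_PROJ)
dual = dual_image(T, [1.0] * N_CAM)  # flood-light the scene from the camera side
print(len(dual))  # one value per projector pixel
```

Note the cost: one exposure per projector pixel in this naive version, which is why the parallel patterns matter so much.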
The swapped image has the same resolution as the projector, so yes, it seems you'd get a very low-resolution image if you used a large light source like the sun. You can see this in the "Accidental Pinhole" supplemental video: the reconstructed image is very low resolution, but the quality is slightly better when the light comes in through a smaller area (a mostly closed window).
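Here's a rough way to see why a big light source costs you resolution, again as a toy model of my own rather than anything from the paper. It assumes that light pixels which can't be switched independently behave like one merged column of the transport matrix. `merge_light_pixels` and the numbers in `T_FINE` are hypothetical.

```python
def dual_image(T, virtual_light):
    """View from the light source: p' = T^T c'."""
    return [sum(t * c for t, c in zip(col, virtual_light)) for col in zip(*T)]

def merge_light_pixels(T, block):
    """Transport matrix when every `block` adjacent light pixels switch together."""
    return [[sum(row[j:j + block]) for j in range(0, len(row), block)] for row in T]

# Made-up scene: 3 camera pixels, 8 independently controllable light pixels.
T_FINE = [
    [4, 0, 0, 2, 0, 0, 1, 0],
    [0, 3, 0, 0, 5, 0, 0, 1],
    [1, 0, 2, 0, 0, 3, 0, 0],
]

# A "large" source: only 2 independent regions of 4 light pixels each.
T_COARSE = merge_light_pixels(T_FINE, 4)

fine = dual_image(T_FINE, [1, 1, 1])
coarse = dual_image(T_COARSE, [1, 1, 1])
print(len(fine), len(coarse))  # dual image size follows the light source, not the camera
```

The camera resolution never changes here; only the number of independent light elements does, and that's what sets the size of the swapped image.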