They never seem to mention that in the paper, at least not prominently, though admittedly I only skimmed it today. But Photoshop already has a built-in tool for this, so I guess they can just use the standard methods, which seem to work fairly well.
Judging from the YouTube videos, the novel part is that they can fill in the part of the object that is occluded in the photo (either using textures from the 3D model or by using InPaint), since they refer to earlier work that already lets you cut out and manipulate objects using 3D models.