That's an interesting example! Perhaps too out-of-distribution, though. For fair comparison with other methods, we used the DIV2K training set in our paper, which only comprises 800 images. Would be cool to train a version on a much bigger set, potentially including images similar to what you tried :)
As others have mentioned, this model just emphasizes pixels and compression artifacts, so it's not much use for improving old or low-quality images.
I tried doing some pixelart->HD conversion with Gemini 2.0 Flash instead, and the results look quite promising:
The images are, however, all over the place, as it doesn't seem to stick very closely to the prompt. Trying to fine-tune the image with further chatting often leads to overexposed-looking pictures.
All the results were produced with prompts along the lines of "here is a pixelart image convert it into a photo" or some variation thereof. No img2img, LoRA or anything here, all plain Gemini chat.