I think it works well :) Which architecture is this based on? Did you consider any tricks or fine-tuning to generate images that are perceptually similar to art?
It is based on VQGAN (trained on ImageNet) and OpenAI's CLIP.
To make it more similar to art, you would need to do some "prompt engineering". Try "painting of ..." or "... painted by Van Gogh".
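
If it helps, here is a minimal sketch of that kind of prompt tweak in Python. The helper name `art_prompts` and the template list are made up for illustration and are not part of the actual project; it only builds the text prompts, which you would then pass to whatever VQGAN+CLIP script you are running.

```python
# Hypothetical helper (not from the project): wraps a plain subject
# in the two art-style prompt patterns suggested above.
ART_TEMPLATES = [
    "painting of {subject}",
    "{subject} painted by Van Gogh",
]


def art_prompts(subject, templates=ART_TEMPLATES):
    """Return one prompt string per template, with the subject filled in."""
    subject = subject.strip()
    return [template.format(subject=subject) for template in templates]


if __name__ == "__main__":
    # Print each variant so it can be pasted into the generator as a text prompt.
    for prompt in art_prompts("a lighthouse at sunset"):
        print(prompt)
```

Trying a few variants of the same subject side by side is usually the quickest way to see which wording pushes the output toward a painted look.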