It's very interesting. I tried a similar approach, just giving some input text to Stable Diffusion, but the images were not satisfactory, so I didn't pursue the idea. When a site has serious content, people are very reluctant to visit it. But a satirical or comical way of telling the stories attracts them very fast, like TikTok videos.
It currently uses OpenAI models for the content and Stable Diffusion (via the AI Horde) for the images.