Hacker News new | ask | show | jobs
by eig 284 days ago
While I think most of the examples are incredible...

...the technical graphics (especially text) is generally wrong. Case 16 is an annotated heart and the anatomy is nonsensical. Case 28 with the tallest buildings has the decent images, but has the wrong names, locations, and years.

2 comments

Yeah I think some of them are really more proof of concept than anything.

Case 8 Substitute for ControlNet

The two characters in the final image are VERY obviously not in the instructed set of poses.

Yes, it's Gemini Flash model, meaning it's fast and relatively small and cheap, optimized for performance rather than quality. I would not expect mind-blowing capabilities in fine details from this class of models, but still, even in this regard this model sometimes just surprisingly good.