Hacker News new | ask | show | jobs
by desro 1467 days ago
Some of the issues seemingly stem from the model's either poor or mis-understanding of the input language... I wonder what a fusion of DALL-E + GPT3 or LaMBDA, where the text-based models perform prompt interpretations, would look like.

This may be a naïve thought as my understanding of all models mentioned is superficial at best.

1 comments

The text input comprehension is (supposedly) much better in Google's "DALL-E 2", https://imagen.research.google/