Hacker News
by namelosw 996 days ago
Interesting.

This is like DeepFloyd, but probably combined with OpenAI's strength in the NLP field.

I remember Ilya Sutskever often mentioning in interviews how important multimodality is. ControlNet can produce more impressive results for sure, but a model with a strong, unified understanding of modalities like language, light, and space will push the industry forward towards the goal of AGI.