by pistachiopro 2069 days ago
It's still an interesting paper, but I was disappointed they were "just" concatenating an image generator with a language model. I'm really excited for when someone figures out concurrently trained models, say, alternating between training passes of GPT-3 and iGPT, such that the very same attention layers deal with both language and visual/spatial conceptualization. I expect common-sense reasoning capabilities to take a huge leap at that point.
1 comment

Training on language and visual cues at the same time is indeed the next important milestone to achieve.