MultiModal-GPT: A Vision and Language Model for Dialogue with Humans


	MultiModal-GPT: A Vision and Language Model for Dialogue with Humans (github.com)
	4 points by vov_or 1135 days ago

1 comment

The authors trained a multi-modal chatbot on visual and language instructions, building on the open-source multi-modal model OpenFlamingo!