| HN Mirror


	by Spivak 810 days ago
	It's an impressive demo; it's not (yet) an impressive product. It seems like the people who are oohing and aahing at the former and the people who are frustrated that this kind of thing is unbelievably impractical to productize will be doomed to talk past one another forever. The text generation models, image generation models, speech-to-text and text-to-speech have reached impressive product stages. Multimodal hasn't got there because no one is really sure what to actually do with the thing outside of making cool demos.

1 comment

0xB31B1B 810 days ago

Multimodal isn't there because "this is an image of a green plant" is viable in a demo, but it's not commercially viable. "This is an image of a monstera deliciosa" is commercially viable, but not yet demoable. The models need to improve to be usable.
